CIS 41A Unit G, Problem G

This assignment has one script with three parts.

At the top of each of your scripts, put the following multi-line comment with your information:

'''
Your name as registered, with optional nickname in parentheses
CIS 41A   Fall 2021
Unit G, Problem G
'''

All print output should include descriptions as shown in the example output below.
Your script should contain a main function.
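As a rough illustration of the layout described above, a script might be organized as follows. This is only a sketch: the helper name format_result is hypothetical and is not required by the assignment.

```python
'''
Your name as registered, with optional nickname in parentheses
CIS 41A   Fall 2021
Unit G, Problem G
'''


def format_result(description, value):
    # Hypothetical helper: pairs every printed value with a description.
    return f"{description} {value}"


def main():
    # The code for each part would be called from here.
    print(format_result("Highest population state in the Midwest is:", "IL 12802000"))


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard runs main only when the file is executed as a script, not when it is imported.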

Part One – Reading a data file

For this exercise, you will need to download the file States.txt from Canvas and save it into the same directory as your Python script. To do this, log in to Canvas, select CIS 41A, select Files, select States.txt, select Download, and save the file into the same directory as your Unit G take-home Python script.

The file has 50 lines of data, one for each state in the United States. Each line contains three pieces of data separated by spaces: the two-letter abbreviation of the state's name, the region that the state is in, and the 2016 population of the state.

You need to find and print the state with the highest population in the Midwest region.

Note: States.txt is not a CSV file, so don't try to read it with a CSV reader.

Example output:

Highest population state in the Midwest is: IL 12802000
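One possible way to approach space-separated lines like these is to split each line and track a running maximum. The sketch below uses an in-memory stand-in for the file; only the IL line matches the example output, and the other lines, the region names West and South, and the function name highest_in_region are made-up placeholders.

```python
import io

# Placeholder data, NOT the contents of States.txt (only the IL line is from the example output).
SAMPLE = """IL Midwest 12802000
OH Midwest 11000000
CA West 39000000
GA South 10000000
"""


def highest_in_region(lines, region):
    # Returns (state, population) for the most populous state in the given region.
    best_state, best_pop = None, -1
    for line in lines:
        fields = line.split()          # split on whitespace, no csv reader needed
        if len(fields) != 3:
            continue                   # skip blank or malformed lines
        state, line_region, population = fields
        population = int(population)   # the file stores numbers as text
        if line_region == region and population > best_pop:
            best_state, best_pop = state, population
    return best_state, best_pop


state, pop = highest_in_region(io.StringIO(SAMPLE), "Midwest")
print("Highest population state in the Midwest is:", state, pop)
```

With the real file, the io.StringIO object would be replaced by a file opened with `with open("States.txt") as infile:`, since a file object can be looped over line by line in the same way.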

Part Two – A Dictionary of Lists

Download the file USPresidents.txt from Canvas and save it into the same directory as your Python script. To do this, log in to Canvas, select CIS 41A, select Files, select USPresidents.txt, select Download, and save the file into the same directory as your Unit G take-home Python script.

The file has 44 lines of data, one for each president in the history of the United States. Each line contains two pieces of data separated by a space: the two-letter abbreviation of the state where the president was born, and the name of the president (for your convenience, the president's name has been converted to a single string – George Washington has been converted to George_Washington).

Using the data from the file, you need to build a dictionary of states and the presidents born in those states. Each key will be a state abbreviation and each value will be a list of presidents. Use defaultdict to initialize the dictionary so that its default values are empty lists.
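As a minimal sketch of the defaultdict idea, the snippet below builds a dictionary of lists from a few sample lines rather than from the real file. The list name sample_lines is an assumption made for illustration.

```python
from collections import defaultdict

# A few sample lines in the "state name" format; not the full file.
sample_lines = [
    "VA George_Washington",
    "MA John_Adams",
    "VA James_Madison",
]

# defaultdict(list) creates an empty list the first time a key is accessed,
# so there is no need to check whether the state is already present.
presidents_by_state = defaultdict(list)
for line in sample_lines:
    state, name = line.split()
    presidents_by_state[state].append(name)

print(dict(presidents_by_state))
```

With a plain dict, the append would raise a KeyError for a state that has not been seen yet; the default factory is what removes that special case.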

After building the dictionary, determine the state with the most presidents and how many presidents were born there. Print the names of those presidents.

Example output:

The state with the most presidents is VA with 8 presidents:
George_Washington
James_Madison
James_Monroe
John_Tyler
And so on...
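One way to find the key whose list is longest is to pass a key function to max. The sketch below assumes a small hand-built dictionary, so its counts differ from those in the real data.

```python
# Small hand-built sample, not the full dictionary from the file.
presidents_by_state = {
    "VA": ["George_Washington", "James_Madison", "James_Monroe"],
    "MA": ["John_Adams"],
}

# max iterates over the keys; the key function ranks each state by list length.
top_state = max(presidents_by_state, key=lambda s: len(presidents_by_state[s]))
names = presidents_by_state[top_state]

print(f"The state with the most presidents is {top_state} with {len(names)} presidents:")
for name in names:
    print(name)
```

If two states tie, max returns the first one it encounters, which may or may not matter for a given data set.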

Part Three – Dictionary Keys and Sets

Using a dictionary comprehension, build a new dictionary from the data in the Part Two dictionary. Each key will again be a state abbreviation, but each value will be the count of presidents from that state.
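A dictionary comprehension for this kind of transformation could look like the following sketch. The input dictionary here is a small hand-built sample, and the name president_counts is an assumption.

```python
# Small hand-built sample standing in for the Part Two dictionary.
presidents_by_state = {
    "VA": ["George_Washington", "James_Madison", "James_Monroe"],
    "MA": ["John_Adams"],
}

# Same keys, but each value becomes the length of the original list.
president_counts = {state: len(names) for state, names in presidents_by_state.items()}

print(president_counts)
```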

Create a set of the ten most populous US states (CA, TX, FL, NY, IL, PA, OH, GA, NC, MI).
Note: take this as a given; you do not have to find these ten states on your own.

Then, using a set operator, create a new set that represents a set of populous US states that have had presidents born in them.
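The intersection operator & is one set operator that fits this description. In the sketch below, president_counts holds only a handful of entries taken from the example outputs in this assignment, so the result is smaller than it would be with the full data.

```python
populous_states = {"CA", "TX", "FL", "NY", "IL", "PA", "OH", "GA", "NC", "MI"}

# Partial sample: counts taken from the assignment's example outputs only.
president_counts = {"CA": 1, "GA": 1, "IL": 1, "NC": 2, "VA": 8}

# A dictionary's keys view supports set operators directly;
# the result is a regular set of the states found in both.
populous_with_presidents = president_counts.keys() & populous_states

print(populous_with_presidents)
```

VA drops out because it is not in the populous set, and states such as TX drop out here only because the sample dictionary omits them.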

Print a count of this new set along with an alphabetically sorted listing of these states and how many presidents have been born in each.

Example output:

8 of the 10 high population states have had presidents born in them:
CA 1
GA 1
IL 1
NC 2
And so on...
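Because sets are unordered, a sorted listing needs an explicit call to sorted. The sketch below assumes a partial sample of counts, so it reports 4 states rather than the 8 shown in the example output.

```python
populous_states = {"CA", "TX", "FL", "NY", "IL", "PA", "OH", "GA", "NC", "MI"}

# Partial sample of counts, not the full Part Three dictionary.
president_counts = {"CA": 1, "GA": 1, "IL": 1, "NC": 2, "VA": 8}
populous_with_presidents = president_counts.keys() & populous_states

print(f"{len(populous_with_presidents)} of the {len(populous_states)} "
      "high population states have had presidents born in them:")

# sorted returns a new alphabetically ordered list; the set itself is unchanged.
lines = [f"{state} {president_counts[state]}" for state in sorted(populous_with_presidents)]
for line in lines:
    print(line)
```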

Add the following at the end of the script to show your results:

'''
Execution results:
paste execution results here
'''

Submit your finished .py script in Canvas, under Problem G.