I recently came across this dataset from the Federal Election Commission and wanted to explore it to see how money affected the election. Each plot tries to answer a question I thought of while looking through the available data. This post is part 1 of my exploration and visualization of this dataset.
Data is from the FEC
import pandas as pd
import numpy as np
import plotly.plotly as py
import plotly.figure_factory as ff
import plotly.graph_objs as go
import plotly.offline as offline
import cufflinks as cf
cf.set_config_file(theme='polar')
cf.go_online()
Preparing The Data
url = 'https://storage.googleapis.com/mholtzscher-datasets/2016%20FEC%20Presidential/contributors.csv'
df = pd.read_csv(url, index_col=False)
Matching Candidates To Their Parties
parties = {'Rubio, Marco': 'Republican',
'Santorum, Richard J.': 'Republican',
'Perry, James R. (Rick)': 'Republican',
'Carson, Benjamin S.': 'Republican',
"Cruz, Rafael Edward 'Ted'": 'Republican',
'Paul, Rand': 'Republican',
'Clinton, Hillary Rodham': 'Democrat',
'Sanders, Bernard': 'Democrat',
'Fiorina, Carly': 'Republican',
'Huckabee, Mike': 'Republican',
'Pataki, George E.': 'Republican',
"O'Malley, Martin Joseph": 'Democrat',
'Graham, Lindsey O.': 'Republican',
'Bush, Jeb': 'Republican',
'Trump, Donald J.': 'Republican',
'Jindal, Bobby': 'Republican',
'Christie, Christopher J.': 'Republican',
'Walker, Scott': 'Republican',
'Stein, Jill': "Green Party",
'Webb, James Henry Jr.': 'Democrat',
'Kasich, John R.': 'Republican',
'Gilmore, James S III': 'Republican',
'Lessig, Lawrence': 'Democrat',
'Johnson, Gary': 'Libertarian',
'McMullin, Evan': 'Independent'}
df['party'] = df.cand_nm.map(parties)
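One thing to watch with `map`: any candidate name that is not a key in the dictionary ends up with a missing party, and a later `groupby('party')` drops those rows by default. The sketch below shows one way to spot such names. It runs on a tiny made-up frame, and the helper `unmapped_candidates` is a hypothetical name of mine, not part of the original notebook.

```python
import pandas as pd


def unmapped_candidates(frame, mapping, column="cand_nm"):
    """Return the sorted names found in `column` that have no key in `mapping`."""
    names = frame[column].dropna().unique()
    return sorted(name for name in names if name not in mapping)


# Toy rows for illustration only; these are not taken from the FEC file.
sample = pd.DataFrame({"cand_nm": ["Sanders, Bernard", "Paul, Rand", "Doe, Jane"]})
sample_parties = {"Sanders, Bernard": "Democrat", "Paul, Rand": "Republican"}

# Same pattern as above: names without a dictionary entry become NaN.
sample["party"] = sample["cand_nm"].map(sample_parties)

missing = unmapped_candidates(sample, sample_parties)
print(missing)
```

If the list is non-empty, the spelling of the dictionary keys is the first thing I would check against the values in `cand_nm`.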
Filtering Out Refunds
df = df[df.contb_receipt_amt > 0]
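The `> 0` filter treats every non-positive amount as a refund, so zero-dollar rows are dropped too. Here is a minimal sketch, on made-up numbers, of how the effect of the filter could be checked before discarding rows. The variable names are illustrative and the amounts are not real FEC values.

```python
import pandas as pd

# Made-up amounts; the negative values stand in for refunds.
toy = pd.DataFrame({"contb_receipt_amt": [50.0, -50.0, 2700.0, 0.0, -25.0]})

# Boolean mask with the same condition used in the filter above.
is_contribution = toy["contb_receipt_amt"] > 0

# Sum of what the filter removes as refunds (negative amounts only).
refund_total = toy.loc[toy["contb_receipt_amt"] < 0, "contb_receipt_amt"].sum()

kept = toy[is_contribution]
print(f"rows kept: {len(kept)} of {len(toy)}")
print(f"refund total removed: {refund_total:.2f}")
```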
Cleaning Up Occupations
occ_mapping = {
'INFORMATION REQUESTED PER BEST EFFORTS' : 'NOT PROVIDED',
'INFORMATION REQUESTED' : 'NOT PROVIDED',
'INFORMATION REQUESTED (BEST EFFORTS)' : 'NOT PROVIDED',
'C.E.O.': 'CEO'
}
f = lambda x: occ_mapping.get(x, x)
df.contbr_occupation = df.contbr_occupation.map(f)
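The lambda keeps any occupation that is not in the mapping unchanged. As far as I can tell, `Series.replace` with a dictionary gives the same result for this kind of whole-value substitution. The sketch below compares the two on a toy series. The values are made up and the dictionary is a shortened copy of the one above.

```python
import pandas as pd

# Shortened copy of the occupation mapping, for illustration.
occ_map = {
    "INFORMATION REQUESTED PER BEST EFFORTS": "NOT PROVIDED",
    "INFORMATION REQUESTED": "NOT PROVIDED",
    "C.E.O.": "CEO",
}

# Toy occupations; not taken from the FEC file.
toy = pd.Series(["C.E.O.", "ENGINEER", "INFORMATION REQUESTED"])

# Approach used in this post: fall back to the original value when unmapped.
via_map = toy.map(lambda x: occ_map.get(x, x))

# Alternative: replace matches whole values and leaves everything else alone.
via_replace = toy.replace(occ_map)

print(via_replace.tolist())
```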
Create A DataFrame Filtered To The Final Major Candidates
dff = df[df.cand_nm.isin(['Trump, Donald J.','Clinton, Hillary Rodham'])]
Which Party Received The Most Contributions?
party = df.groupby('party').contb_receipt_amt.sum().to_frame().reset_index()
party.iplot(kind='pie', textinfo='value+percent', textposition='outside', labels='party',
values='contb_receipt_amt', filename='fec-2016-party-contributions',
title='2016 Total Contributions(USD) By Party')
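The pie chart shows each party's share of the total. The same numbers can be computed directly, which is handy for checking the chart. This is a small sketch on made-up amounts, not real totals, and it assumes the same `party` and `contb_receipt_amt` column names used above.

```python
import pandas as pd

# Made-up contributions for illustration only.
toy = pd.DataFrame({
    "party": ["Democrat", "Democrat", "Republican", "Libertarian"],
    "contb_receipt_amt": [300.0, 200.0, 400.0, 100.0],
})

# Total per party, then each party's percentage of the grand total.
totals = toy.groupby("party")["contb_receipt_amt"].sum()
share = (totals / totals.sum() * 100).round(1)

print(share.sort_values(ascending=False))
```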
Which Candidate Received The Most In Contributions?
candidate = df.groupby('cand_nm').contb_receipt_amt.sum().sort_values(ascending=False)
candidate.iplot(kind='bar', yTitle='Contributions(USD)',
title='2016 Contributions(USD) By Presidential Candidate', filename='fec-2016-candidate-contributions')
Which Occupations Donated Most Often?
job = df['contbr_occupation'].value_counts()
job[:25].iplot(kind='bar', filename='fec-2016-top-occupation-volume', yTitle='Number of Contributions',
title="Top 25 Occupations By Number of Contributions")
Which Occupations Donated The Most?
job = df.groupby('contbr_occupation').contb_receipt_amt.sum().sort_values(ascending=False)
job[:25].iplot(kind='bar', filename='fec-2016-top-occupation-amounts', yTitle='Total Contribution',
title="Top 25 Occupations By Contribution Amount")
Which Occupations Donated The Most To Hillary Clinton?
job = df[df['cand_nm'] == 'Clinton, Hillary Rodham'].groupby('contbr_occupation').contb_receipt_amt.sum().sort_values(ascending=False)
job[:25].iplot(kind='bar', filename='fec-2016-top-occupation-amounts-clinton', yTitle='Total Contribution',
title="Clinton Donors Top 25 Occupations By Contribution Amount")
Which Occupations Donated The Most To Donald Trump?
job = df[df['cand_nm'] == 'Trump, Donald J.'].groupby('contbr_occupation').contb_receipt_amt.sum().sort_values(ascending=False)
job[:25].iplot(kind='bar', filename='fec-2016-top-occupation-amounts-trump', yTitle='Total Contribution',
title="Trump Donors Top 25 Occupations By Contribution Amount")
Which States Donated Most Often To Clinton And Trump?
state = dff.contbr_st.value_counts().sort_values(ascending=False)[:20]
state.iplot(kind='bar', filename='fec-2016-state-contributions-volume')
Which States Donated The Most To Clinton And Trump?
state_ammt = dff.groupby('contbr_st').contb_receipt_amt.sum().sort_values(ascending=False)[:20]
state_ammt.iplot(kind='bar', filename='fec-2016-state-contributions-ammount')
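The last two questions look at the same grouping from two angles, number of contributions and total dollars. A single `groupby` can produce both, plus an average contribution size. The sketch below uses made-up rows, and the output column names (`n_contributions`, `total_usd`, `avg_usd`) are my own choices for illustration.

```python
import pandas as pd

# Made-up rows for illustration; not taken from the FEC file.
toy = pd.DataFrame({
    "contbr_st": ["CA", "CA", "TX", "NY", "TX", "CA"],
    "contb_receipt_amt": [25.0, 100.0, 2700.0, 50.0, 300.0, 10.0],
})

# Count and sum per state in one pass, largest total first.
by_state = (
    toy.groupby("contbr_st")["contb_receipt_amt"]
    .agg(n_contributions="count", total_usd="sum")
    .sort_values("total_usd", ascending=False)
)

# Average contribution size per state.
by_state["avg_usd"] = by_state["total_usd"] / by_state["n_contributions"]

print(by_state)
```

Putting both measures side by side makes it easier to see states that give often in small amounts versus states that give rarely in large ones.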
The Jupyter Notebook for this work can be found on GitHub.