We provide the solution to this challenge using:
Need a program? Click here.
Here we are in the beginning of 2023. The 2022 midterm elections just wrapped up, but the time for the 2024 campaign is rapidly approaching. You feel a strong desire to get involved and help make a difference. You contact a local campaign manager and a candidate who is running for the house of representatives to see how you can get involved. With a background and experience in finance and accounting, you feel qualified to take on the role of campaign finance director and they offer you the position.
As the finance director, you are responsible for raising the money that will allow the campaign to accomplish its goals. As such, you have a meeting later this week with the campaign staff to go over goals to give your candidate the best chance for success. Luckily, you have access to a dataset that shows the relationship between the financing of campaigns and congressional election results. Your plan is to understand and find patterns in the data to help you come prepared for the strategy meeting.
- The campaign data you have access to is dirty and needs to be cleaned. Identify and resolve the inconsistencies in columns can_nam, can_off, can_off_state, and can_par_aff. It will not matter how you resolve the inconsistencies as long as each respective column is uniform. Additionally, replace null values with 0s and remove duplicates.
- Now that the data has been cleaned, it is ready to be analyzed. Limit all of your findings to just the data for the house of representatives candidates. Your future candidate is a business owner and is prepared to spend a lot of his personal money to help the campaign. You want to find if there is a correlation between winning and the percentage of the candidate’s personal funds in the total contribution (candidate contributions/total contributions). You imagine that anytime a candidate contributes 50% or more of the total contributions that this is significant. Find how many candidates fall into this category and how many of them won their election. Determine for yourself if there is a strong correlation and how you would advise the candidate.
- The dataset provides access to the coverage start date and coverage end date for each candidate. Across the board for the house of representatives candidates, calculate the average number of days these candidates were covered in the news. Separate the candidates into two groups, those who had more than the average number of days in coverage and those who had less than the average number of days in coverage. Again, determine for yourself if there is a strong correlation and how you would advise the candidate.
- You understand that campaign spending varies dramatically depending on the area. However, you want to see if there is a connection between raising a lot of money (higher total contributions) and winning. First, calculate the average total contribution for the candidates in each state. Then, find the candidates who had more total contributions than the average for their state respectively. And again, determine for yourself if there is a strong correlation and how you would advise the candidate.
- Now in closing, remember, you are the professional! The campaign manager, candidate, and the rest of the staff will be looking to you for your accounting and finance expertise. Go through the data and data dictionary, come up with an insightful question, and use alteryx to find the information you are looking for. What you find should add to the conversation in this meeting of formulating a strategy.
- In summary, analyze all of the information you have gathered in steps 2 through 5. What insight or advice will you share in your upcoming meeting? Share how you would advise the team in the comments below as well as share what you chose to find in step 5.
Suggestions and Hints
- With the surplus of information stored in this excel sheet, I highly recommend taking some time before starting the challenge to understand the information you have access to by going over the Data Dictionary.
- I imagine there are many ways in which you could clean the columns in step one that are dirty. I used regex to clean columns can_off and can_par_aff. Additionally, I used a data cleansing tool to fix the capitalization for these columns as well as can-nam. Lastly, I used a find replace tool to clean the can_off_sta. I recommend using this website to assist you with state and territory abbreviations.
- In my solution, I copied and pasted this list into a text input tool. For your convenience, I created an excel sheet that you can use instead.