114|ALTERYX – Campaign Captain

BYU Student Author: @DylanKing
Reviewers: @Brett_Lowe, @Boston
Estimated Time to Solve: 60 Minutes

We provide the solution to this challenge using:

  • Alteryx

Need a program? Click here.

Overview
Here we are in the beginning of 2023. The 2022 midterm elections just wrapped up, but the time for the 2024 campaign is rapidly approaching. You feel a strong desire to get involved and help make a difference. You contact a local campaign manager and a candidate who is running for the house of representatives to see how you can get involved. With a background and experience in finance and accounting, you feel qualified to take on the role of campaign finance director and they offer you the position.

As the finance director, you are responsible for raising the money that will allow the campaign to accomplish its goals. As such, you have a meeting later this week with the campaign staff to go over goals to give your candidate the best chance for success. Luckily, you have access to a dataset that shows the relationship between the financing of campaigns and congressional election results. Your plan is to understand and find patterns in the data to help you come prepared for the strategy meeting.

Instructions

  1. The campaign data you have access to is dirty and needs to be cleaned. Identify and resolve the inconsistencies in columns can_nam, can_off, can_off_state, and can_par_aff. It will not matter how you resolve the inconsistencies as long as each respective column is uniform. Additionally, replace null values with 0s and remove duplicates.
  2. Now that the data has been cleaned, it is ready to be analyzed. Limit all of your findings to just the data for the house of representatives candidates. Your future candidate is a business owner and is prepared to spend a lot of his personal money to help the campaign. You want to find if there is a correlation between winning and the percentage of the candidate’s personal funds in the total contribution (candidate contributions/total contributions). You imagine that anytime a candidate contributes 50% or more of the total contributions that this is significant. Find how many candidates fall into this category and how many of them won their election. Determine for yourself if there is a strong correlation and how you would advise the candidate.
  3. The dataset provides access to the coverage start date and coverage end date for each candidate. Across the board for the house of representatives candidates, calculate the average number of days these candidates were covered in the news. Separate the candidates into two groups, those who had more than the average number of days in coverage and those who had less than the average number of days in coverage. Again, determine for yourself if there is a strong correlation and how you would advise the candidate.
  4. You understand that campaign spending varies dramatically depending on the area. However, you want to see if there is a connection between raising a lot of money (higher total contributions) and winning. First, calculate the average total contribution for the candidates in each state. Then, find the candidates who had more total contributions than the average for their state respectively. And again, determine for yourself if there is a strong correlation and how you would advise the candidate.
  5. Now in closing, remember, you are the professional! The campaign manager, candidate, and the rest of the staff will be looking to you for your accounting and finance expertise. Go through the data and data dictionary, come up with an insightful question, and use alteryx to find the information you are looking for. What you find should add to the conversation in this meeting of formulating a strategy.
  6. In summary, analyze all of the information you have gathered in steps 2 through 5. What insight or advice will you share in your upcoming meeting? Share how you would advise the team in the comments below as well as share what you chose to find in step 5.

Data Files

Suggestions and Hints
  • With the surplus of information stored in this excel sheet, I highly recommend taking some time before starting the challenge to understand the information you have access to by going over the Data Dictionary.
  • I imagine there are many ways in which you could clean the columns in step one that are dirty. I used regex to clean columns can_off and can_par_aff. Additionally, I used a data cleansing tool to fix the capitalization for these columns as well as can-nam. Lastly, I used a find replace tool to clean the can_off_sta. I recommend using this website to assist you with state and territory abbreviations.
  • In my solution, I copied and pasted this list into a text input tool. For your convenience, I created an excel sheet that you can use instead.

Solution

This felt like a good refresher for Alteryx for me. While it felt like a lot of activities, I found there was a lot more copy and paste than I initially theorized. My results from the questions asked are below:

  1. I did not find a strong correlation between personal money being spent and winning, with about 2% of candidates in this pool winning their election. I believe the reason for this is due to scoping. For example, a candidate contributing $2,500 to a campaign with only $4,000 total donations gets a higher percentage than a candidate donating $1,000,000 to a campaign with $4,000,000. I would encourage my client to abandon this theory and focus wholly on soliciting funds from the general public.
  2. News Coverage did have a major effect with 54% of candidates averaging greater coverage in the news winning their elections compared to 4% of candidates averaging less coverage. Seems like coverage can be a winning strategy, and I would advise my client to pitch their candidate to various news sites for interviews and articles to spread the word to as many people as possible.
  3. There is also a strong correlation between raising money and winning state by state. I found that 80% of candidates who raised more money than the state average ended up winning their elections. I would advise my client to spend more of their budget soliciting for donations as that is a proven indicator of political success.
  4. I decided to analyze the impact of having to issue refunds. Does issuing refunds severely impact a candidates ability to win? To do this, I compared correlations of total contributions to net contributions and ultimately found no strong correlation. So therefore, don’t worry about having the strongest accounting system in the world!
  5. My main insight from my findings is that the people matter almost more than the candidate. Focusing on the people in the candidate’s area will yield the most success. Focus on getting to know the people, face to face and over the news media, and you will most likely see success. If donations are being spent on getting the candidate out to as many people as possible, that will be the best strategy for success.

The Alteryx itself was not too challenging once you got your mind wrapped around the data. I found that the Find and Replace tool was a real hero for synchronizing and cleaning the data while the Append Fields and Join Multiple tool was also helpful in a few places. Fun challenge! See my result and overall Workflow below.

Correlation Summary:

Overall Workflow:

Workflow Packaged File:
Campaign Captain.yxzp (468.2 KB)