136|ALTERYX – To Spend or Not to Spend

BYU Student Author: @Boston
Reviewers: @Spencer, @Christian
Estimated Time to Solve: 45 Minutes

We provide the solution to this challenge using:

  • Excel
  • Alteryx

Need a program? Click here.

Overview
You have been newly hired as an intern at Rango Incorporated and your boss has approached you with an important task. Your resume boasts adept skills in Excel and Alteryx, so your boss thought you would be perfect to help go through data to answer important questions. However, there are a few problems with the dataset. Within the excel workbook, there are three sheets related to Employee, Trip, and Purpose data. The person who input the employee data used unprofessional capitalization when each row of employee data should appear like: “John Smith, Sales, 000, Miami, Florida.” In the Trip data, the same employee accidentally input the first 68 rows of data into column “H” and did not split up the data into their proper rows or columns. Your boss would like you to organize, upload, and analyze the data within Alteryx to answer three important questions:

  • How many employees spent more than their estimated costs on trips?
  • Is there a correlation between excess spending and the number of complaints for an employee?
  • Which trip purpose had the greatest excess spending?

Instructions

  1. Parse the Trip data using Alteryx into the correct rows and columns.
  2. Clean the Employee data using Alteryx. Your boss does not want to use any rows that have any null or empty values. For example, you can exclude any row of data that has a null employeename, department, employeeid, city, state, or complaints. There are some duplicate employees in the Employee data that need to be combined and their complaints summed together.
  3. Join these tables so that you can do further calculations.
  4. Use Alteryx tools to find out if employees spent more than their estimated costs by creating a new column “difference” that calculates the difference between the average estimated and actual costs.
  5. For question 3, use all available data (regardless of difference amount) to find the total difference for each Purpose ID.
  6. You should finish with two final tables with the following headers:
    Picture2
    Picture1

Data Files

Suggestions and Hints

The Left, Mid, Right functions or delimiter tool may be helpful to parse the Trip data into the correct columns. It may be helpful to remove the whitespace in the Trip data and unwanted columns with the data cleansing tool.

Solution

Check Figures

The “employeename” table should have 9 employees/rows and the “ExpPurposeDesc” table should have 5 rows.

Solution Image

Challenge136_Solution.xlsx
Challenge136_Solution.yxmd
Solution Video: Challenge 136|ALTERYX – To Spend or Not to Spend

This is my solution to this problem. I actually ended with different answers than the solution proposed above. When cleaning the Trip table I found it necessary to parse the columns into their respective rows and then union the top half of the data with the bottom half of the data. By doing so I was able to use data from all of the trips taken throughout the year. By cleaning both halves of the data my numbers came out significantly different.

Overall, I found this challenge very applicable to situations that would be presented in real-world applications. Between cleaning the data and ensuring that all data was accounted for there were many small mistakes that could be made that would produce inaccurate results.

Great challenges!

1 Like

Time to Complete: 50 Minutes
Rating: Intermediate
This was a fun challenge! I found when comparing my answer to the proposed solution that I had quite a bit of differences, mostly in terms of workflow efficiency. First, I generally try to avoid the data cleansing tool since it’s very resource intensive, so I took a Regex parsing approach to the Trip Data table. I found the Multi Field tool very convenient to fix all the issues in the Employee Data tool since you can apply one formula to multiple fields. My solution did line up with the posted solution. Overall, a great refresher to my Alteryx skills!

Solution Image:

Solution File:

3 Likes

Excellent challenge. Made me really have to slow down and figure things out. Took me about 90 minutes. Here was my solution:

5 Likes


Took me hours to complete it, but I loved the process of thinking through the logic and figuring things out!

3 Likes

Time to complete: 25 minutes
Rating: Intermediate
Comments: This was a great practice for the summation tool and REGEX.

Time: 70 minutes
Difficulty: Intermediate


Time 1 hour 15 minutes
Difficulty Intermediate to Advanced
Notes: Learned a lot about the join.

Time to Complete: 40 minutes
Rating: Intermediate

1 Like


Time: 1.5 Hours
Difficulty: Intermediate to Advanced

Learned that it is very important to plan out the flow before actually doing it. It would have saved me a lot more time.