226|ALTERYX – Pay Gap Puzzle

BYU Student Author: @Marta_Ellsworth
Reviewers: @Andrew_Wilson, @Sterling_Lane
Estimated Time to Solve: 30 Minutes

We provide the solution to this challenge using:

  • Alteryx

Need a program? Click here.

Overview
Your consulting firm has been engaged by Orange County, Florida, to undertake a crucial analysis concerning the pay equity between male and female employees within the county’s workforce. In response to this important task, your supervisor has tasked you with preparing the foundational dataset that will serve as the cornerstone for this comprehensive analysis. You’ve chosen Alteryx as your primary tool to accomplish this task because of its power and efficiency for data preparation tasks. Your objective is to meticulously prepare and clean the data, ensuring it is ready for an in-depth analysis of the pay disparities that may exist. This initial step is critical and will set the stage for the subsequent analysis your team will conduct to uncover and address any issues of gender pay inequality that may exist within Orange County’s employment structure.

Instructions

  1. Calculate the total average pay by gender:
    a. Add together the three pay columns (Base_Salary, Overtime_Pay, Longevity_Pay) then find the total average pay by gender.
    b. Expected columns: Gender, Avg_Total_Pay
  2. Analyze pay differences across departments:
    a. Identify the difference in average total pay between male and female employees for each department.
    b. Only consider departments that employ both male and female employees.
    c. Organize the results to showcase departments with the largest pay differences at the top.
    d. Expected columns: Department, M_Avg_Total_Pay, F_Avg_Total_Pay, Difference
  3. Analyze pay differences across divisions:
    a. Identify the difference in average total pay between male and female employees for each division.
    b. Only consider divisions that employ both male and female employees.
    c. Organize the results to showcase divisions with the largest pay differences at the top.
    d. Expected columns: Division, M_Avg_Total_Pay, F_Avg_Total_Pay, Difference

Data Files

Suggestions and Hints
  • Before calculations, clean your dataset by converting blanks in numerical columns to zeros, ensuring accurate addition of the three pay columns.
  • One way to solve this would be to start by segmenting your dataset based on gender. Next, calculate the average pay for each segment. Once you have these averages, reunite the segmented data by department or division.
  • When merging the gender-segmented datasets back together based on department or division, utilize only the ‘Inner Join’ or ‘J’ output of the join tool. This ensures that you’re only including departments or divisions containing both male and female employees.

Solution

Time to Complete: 35 minutes
Rating: Intermediate
I liked this challenge because it showed me that even if it looks long and complicated, it’s actually relatively simple!



Screenshot 2025-03-25 205705

3 Likes

Time to Complete: 45 minutes
Rating: Intermediate

7 Likes

Time to Complete: 45 minutes
Rating: Intermediate!

I really enjoyed this. It was a unique problem that helped me learn a lot!

5 Likes


Intermidiate:45mins

2 Likes

Time: 35min
Rating: Intermediate / Hard
Good challenge, had to get some help from the solution in order to get back on track a one point. Overall a good practice.




1 Like

Time to Complete: 40 Minutes
Difficulty: Intermediate
Good challenge! I was happy to be able to get the solution without referencing the solution (not typically the case for these)




3 Likes

Time: 40 Minutes
Difficulty: Intermediate



1 Like

Time to Complete: 30 Minutes
This was a good challenge! I liked the logic behind using the filter and join tools. It took a little bit for me to figure it out.

1 Like

Time to Complete: 40 minutes
Rating: Intermediate

1 Like

Time to Complete: 35 min
Difficulty: wasnt too bad since I did it in a very repetitious way
Solution:

Time to Complete: 40 min
Difficulty: Intermediate - This helped me learn about data aggregation across variables just like SQL.




Time to complete: 25 minutes
Difficulty: Intermediate

Time to complete 45 mins
Rating: intermediate

Lots of practice with joins

difficulty: Intermediate
Time: 1 Hour
Had to look at the solution once to see what was going on, but then figured it out from there

1 Like

Time: 40 minutes
Difficulty: Intermediate
This was a good challenge! I learned about joins and the summarize function.

1 Like


problem wasnt bad. Pretty sure i got the same answers as everyone else

Time to Complete: 45 minutes
Rating: Intermediate
Definitely took me a bit longer and made me become a little more organized


Screenshot 2025-03-31 171017

2 Likes

Time to complete: Intermediate
Difficulty: 45

Time: 30 Min
Rating: Intermediate