BYU Student Author: @Christian
Reviewers: @Spencer, @Brett_Lowe @Boston
Estimated Time to Solve: 20 Minutes
We provide the solution to this challenge using:
Need a program? Click here.
Working as a financial analyst for NY Goliath, Inc., you are assigned an overnight project of getting 2 lists: 1) The 25 most common customers that are ordering from us and, 2) the top 10 employees that make the most sales. However, when you get the data, there is an error in converting the information you needed and, consequentially, each row of information has been combined into the same cell for the two master files! Thankfully, the transaction file generated correctly into separate columns. Your boss has requested you generate the customer list and the employee list under the names “Top25Customers” and “Top10Employees”, respectively.
- Parse the master files “Customers” and “Employees” with the appropriate information entered into each column
- After parsing the needed files, Create 2 tables with the following columns:
- Table 1: Cust_Num, Comp_Name, First_Name, Last_Name, Phone_Number, Address, City, State, and NumOfSOs. Table 2: EmpID, First_Name, Last_Name, Phone_Number, Job_Title, Department, and Numof SOs.
- After creating the two tables, find the Top 25 Customers by number (count) of SOs and Top 10 Employees by number (count) of SOs. Save each of these two findings to separate excel sheets titled “Top25Customers” and “Top10Employees”.
Suggestions and Hints
Using Text-to-columns, split each master files into their specific columns. And then begin to summarize the information into the appropriate tables.
Fun challenge! My customer list looked different from yours because I didn’t sort by company name after sorting by “NumOfSOs.” I wonder if there is a way to return the top 25 customers and all additional customers if their SO count is equal to the count in position 25. In this case, some customers with 11 SOs were cut off but I wonder if we can figure out a way to include all of them. You could filter the results to return customers with a count equal to or greater than 11, but then that value is hard coded in. Ideas anyone?
I had the same question. There is probably a more efficient way to solve for this, but this is what I came up with. It looks like there are about 41 customers that have at least 11 SOs, which was the number of SOs for the customer in the 25th row when sorted by “NumOfSOs” after the join.
Great Challenge! I didn’t use the sort method before I joined, because I figured it would all be filtered through later during the summarize tab when I made the two tables. But other than that my solution looks basically the same!
Great challenge! Always a lot of fun when I get to practice cleaning data and joins. I did nearly the same thing as you but I used the sample tool rather than the select records tool that you used. Here’s my solution:
Great weekly challenge! Fun way to practice data cleaning with RegEx.
This was a great challenge to refresh some of my Alteryx data cleansing skills!
I really liked this challenge! It was a good reminder on data cleansing and joins from a transaction table. The only thing I would change is that the top 25 customers will exclude some customers that had the same amount of sales as someone within the top 25. I summarized the transaction table before I joined it with the customer and employee tables
Goliath Cleaning Challenge.yxmd (25.8 KB)
That was a great challenge!
Definitely ran into a few hiccups, hence the long solution, but it was nice to brush up on my alteryx skills.
My workflow looked very similar to the solution. I never used the Select Records tool to get just the top number of records which was a quick solution. If I was to do it over again and to ensure there isn’t a tie between the 25th and 26th record I may use a formula to assign ranks and filter that way instead. Great challenge!
Fun challenge! I was able to learn new tools that I am sure will help me more in class
GoliathCleaning_BenOehler.csv (26.4 KB)
This one was tricky but once I got my data types to match up I was able to figure it out.
This is the hardest one i have done but I learned some valuable lessons
This one took me more than 20 minutes but it was good to stretch my learning