61|EXCEL – Determining Cost Drivers with Regression

BYU Student Author: @Parker_Sherwood
Reviewers: @Jonathan_Weston, @Erick_Sizilio
Estimated Time to Solve: 15 minutes

We provide the solution to this challenge using:

Need a program? Click here.

This challenge requires the data analysis ToolPak. To learn how to install it, click here.

You work at Park’s Production, a simple manufacturing facility that makes one product. The facility has data on its first 500 production runs. The company doesn’t have much experience with managerial accounting, so they took a chance and hired you since you’ve had a class or two about the subject. At your most recent meeting with the production manager, you learned that the company is planning to allocate overhead based on headcount (at a rate of $1,000/head). You know that good cost drivers directly contribute to the cost being allocated. For example, headcount may be a good driver for a supervisor’s salary since supervisors are directly responsible for the people they oversee. Even though you’re new, you have your doubts about headcount being a good driver for all overhead costs such as depreciation, rent, supervisor salaries, etc.

You have some experience with the data analysis ToolPak and regression analysis and are ready to put it to use. You know that something is statistically significant if the p-value is less than 0.05. You also know that “R Square” in the regression output denotes a high correlation for values between +/- 0.5 and +/- 1. Values between +/- 0.29 and 0 denote a weak or nonexistent relationship. The best drivers, as a result, will show a “R Square” above 0.5 (since we expect a positive relationship between a driver and its cost).

It’s time to find out just how good headcount is as a cost driver—and if there’s a better option.


  1. Take a look at the rate determined in cell L1. Do you think this is the best way to determine the rate using headcount as a driver? Try another calculation (use columns H, I, J as needed) to see if there’s an argument that $1,000 is even the right number if using headcount.
    Suggestions and Hints

    Calculate the rate of manufacturing overhead/headcount for each production run, then average those. Is it the same as the initial estimate?

  2. Now run a regression analysis on headcount (X variable) and manufacturing overhead (Y variable). Output the results to a new worksheet (see suggestions and hints for what the regression inputs should look like). What do you learn from the p-value? (Cell E18) What do you learn from the R Square? (Cell B5)
    Suggestions and Hints

    Once you open the analysis ToolPak, choose “Regression”. Be sure to include labels (check the checkbox) because that increases readability.

  3. Now run a regression on direct labor hours (X variable) and direct labor (Y variable). Are the results what you would expect?
    Suggestions and Hints

    You should notice that direct labor hours and direct labor costs are perfectly correlated. This makes sense for a place that pays hourly wages.

  4. Try at least on more regression to see if there’s a variable that is better than headcount at predicting manufacturing overhead costs. Put your R Square in the comments!
    Suggestions and Hints

    Try machine hours and manufacturing overhead.

  5. Bonus: Create a column to estimate the overhead expense for each production run. You’ll need the intercept coefficient (B17 of the output), the x variable coefficient (B18 of the output), and the driver for each production run. What else could you do to impress the production manager?

Data Files


I used the machine hours on my regression table and the r coefficient was .91 so it is a better indicator than head count. The file was too big so I attached a couple screenshots


Challenge61_Data_JamesGerstner.xlsx (58.0 KB)
Machine Hours was a much better predictor, with an r-squared value of .913

Challenge61_Data_HallieAnderson_0204.xlsx (53.9 KB)

R squared is 0.913564068 therefore Machine Hours and MOH would be the best Cost Drivers according to our regression analysis.

  1. 1138.55
  2. R square is 0.013 so there is a weak or nonexistent relationship between headcount and manufacturing overhead.
  3. R square is 1 so there is a high correlation between direct labor and direct labor hours.
  4. I chose machine hours and manufacturing overhead. R square is 0.913 so there is a high correlation between machine hours and manufacturing overhead.

Challenge 61.xlsx (43.6 KB)

1 Like

0201_Zullo_Excel61.xlsx (43.1 KB)

Challenge61_Data.xlsx (101.9 KB)

I used the values of machine hours and manufacturing overhead which had an R square value of 0.91.

My file was too big so here are a few screenshots.


Scott_Evan_Challenge61_Data.xlsx (53.3 KB)

1 Like

I chose machine hours as the best estimate of overhead expenses.

This makes sense both from a logical standpoint as well as supporting that idea with the regression. There is a high correlation between machine hours and overhead unlike the number of employees.

BT-Challenge61_Data.xlsx (33.8 KB)

0202_Fischer_HW2Solution.xlsx (41.6 KB)

  1. The average is 1,138.55 so I think the 1,000 is a decent estimate for the proposed manufacturing overhead rate.
  2. From the small p-value, I conclude the hypothesis is statistically significant and I would conclude from the R Square that there is not much of a connection between the manufacturing overhead and headcount because the R Square is so small (0.0126).
  3. Yes, the results are perfectly correlated which makes sense because the direct labor hours and direct labor itself should be highly connected with each other.
  4. Yes, I did machine hours and manufacturing overhead and found the R Square to be 0.9135 which is higher than the previous R Square of 0.0126.

After doing some quick math analyses, I learned that machine hours (even without a regression analysis) appears to share the closest connection with manufacturing overhead. After doing a regression, I found that it very closely correlates to manufacturing overhead, much more than any of the other potential drivers.
Challenge61_Data_Jeffrey_Winters.xlsx (69.6 KB)

0205_Moore_Challenge61.xlsx (41.5 KB)

My R-square value is 0.9133 for “Machine Hours” and “Manufacturing Overhead”. While the p-value was incredibly small on all of the regressions (meaning the observed relationship is statistically significant), it does not attest to the practical significance of the results. The high R-square value indicates that the model using machine hours is a better fit.

1 Like

Machine hours were the best fit for allocating manufacturing overhead as it is closely correlated to overhead cost. I got an adjusted r squared of 0.91339 for machine hours.

0611_Doles_challenge62.xlsx (33.8 KB)

1 Like

In using various data analysis tools in excel, I came to the conclusion that machine hours best connection withe manufacturing overhead.

All other options did not yield strong enough correlations to be deemed as a better driver.

Schoepf, Samuel - Challenge Data Solutions.xlsx (48.4 KB)

1 Like

0301_Taylor_Challenge61_Data.xlsx (89.9 KB)
Machine Hours is the best cost driver with MOH with an R square value of .913. The R square value between headcount is much lower than the relationship between Machine Hours and MOH. I was also able to create a graph showing a fairly linear relationship between Machine Hours and MOH.


I also used machine hours and got a good r square value of .91.
Challenge61_Data.xlsx (50.5 KB)

R square: 0.913318508
0306_Ashcroft_Challenge61_Data.xlsx (43.2 KB)

1 Like

Pieper Nelson Challenge 61.xlsx (56.0 KB)
My R-squared for MOH and machine hours was 0.91 indicating a strong correlation between the two.

0608_Lee_Challenge61_Data.xlsx (35.0 KB)
Machine Hours and Manufacturing Overhead would be the best cost drivers since the R squared is 0.913564.