28|SQL – Big Money Budget

BYU Student Author: @Alex_Garrett
Reviewers: @Marco, @DylanKing, @Spencer
Estimated Time to Solve: 40 Minutes

We provide the solution to this challenge using:

  • SQL (Microsoft Access)

Need a program? Click here.

Overview
PANIC! COVID-19 has been difficult for the audit firm because so much of your work used to be done at the client site. Additionally, travel restrictions and client withdrawal have made it even more challenging. Your manager is unsure of the firm’s future if COVID continues to disrupt its operations.

The budget for 2020 travel expenditures has been adjusted to reflect these changes, but your manager suspects that employees have been using the firm’s funds to vacation while “working from home.” While it is not the end of the year, you are tasked to investigate budget discrepancies in order to identify areas that require further investigation. Your manager shares with you the “Expense Database” which has expense data since the firm’s inception. Read the data dictionary to understand how the database is organized before running queries.

Instructions
This challenge focuses on manually typing SQL statements to retrieve the desired data. Query Design can also be used to solve this challenge, but it’s extremely helpful to be able to read SQL statements and know what is happening. I recommend manually typing the SQL statements before using Query Design. Click here for a quick video on how to get your SQL statements started.

  1. Show employees who have the most missing receipts (include all years), but show only employees who are missing more than 5 receipts. Use the following headers for your output: “EmpName” and “MissingReceipts”. Show the employee with the most missing receipts first.
  2. Show employees who underestimated their total costs for trips that began in 2020. Use the following headers for your output: “EmpName”, “TotalEstCosts”, “TotalActCosts”, and “Difference” (configure the “Difference” column so that an underestimated budget results in a negative difference). Round the difference column to 2 decimal places, and show the employee with the greatest difference first.
  3. Show employees who spent more than their budget in 2020. Include all trips that began in 2020. Use the following headers for your output: “EmpName”, “BudgetAmount”, “Expenses”, and “Difference”. Round the difference column to 2 decimal places, and show the employee with the greatest difference first.
  4. Show the company-wide budget for the past three years (2018-2020), and show how much money the company over- or under-spent for each of those years. Use the start date of trips to determine in what year we account for them. Use the following headers for your output: “BudgetYear”, “TotalBudget”, “TotalExpense”, and “Difference”. Round the expense and the difference columns to 2 decimal places and order this table by “BudgetYear” ascending.

Data Files

Suggestions and Hints

The UNION operator is a less-known tool, but it can be very useful in completing this challenge.

Solution

I loved this SQL challenge! The UNION was a nice little complex wrinkle in the problem. I was reminded of the importance of the syntax in using SQL queries. Here is what I got!

6 Likes

Flashback to the Junior Core! Fun to refresh my SQL skills. I didn’t use a UNION for the last question… instead I created two separate tables (one for budget and one for actuals) then joined them together on Year. Here’s my code:

SELECT bb.BudgetYear, bb.TotalBudget, ee.TotalExpense, ROUND(bb.TotalBudget-ee.TotalExpense, 2) as Difference
FROM

(SELECT b.BudgetYear, SUM(b.BudgetAmount) AS TotalBudget
FROM Budget AS b
WHERE b.BudgetYear IN (2018, 2019, 2020)
GROUP BY b.BudgetYear
ORDER BY b.BudgetYear ASC
) as bb

INNER JOIN 

(SELECT YEAR(t.StartDate) as BudgetYear, SUM(td.Amount) as TotalExpense
FROM TripDetail as td
LEFT JOIN Trip as t
ON td.TripID = t.TripID
WHERE YEAR(t.StartDate) IN (2018, 2019, 2020)
GROUP BY YEAR(t.StartDate)
ORDER BY Year(t.StartDate) ASC
) as ee

ON bb.BudgetYear = ee.BudgetYear
;
6 Likes

Here are my results for the four problems:
1)
select empname, Count as MissingReceipts from (
SELECT t.empid, td.receiptincluded, e.empname, count(t.tripid) as Count from ((trip t
inner join tripdetail td on td.tripid=t.tripid)
inner join employee e on t.empid=e.empid)
where td.receiptincluded= “N”
group by t.empid, td.receiptincluded, e.empname
order by count(t.tripid) desc)
where Count>5
;

select e.empname, e.empid, Estimatedcosts as TotalEstCosts,ExpenseAMount as TotalActCosts, Difference from employee e
inner join
(
select t.empid, sum(t.estimatedcosts) as Estimatedcosts, sum(td.ExpAmt) as ExpenseAMount, sum(t.estimatedcosts)- sum(td.ExpAmt) as Difference from trip t
inner join

(select sum(amount) as ExpAmt, tripid from tripdetail
group by tripid) td

on t.tripid=td.tripid
where year(t.startdate)=2020
group by t.empid
) dt
on e.empid=dt.empid
where Estimatedcosts<ExpenseAMount
order by difference

select * from (
select empname, b.budgetamount, sum(td.amount) as Expenses, sum(amount)-b.budgetamount as Difference from (((trip t
inner join employee e on t.empid=e.empid)
inner join budget b on b.empid=e.empid)
inner join tripdetail td on td.tripid=t.tripid)
where year(startdate)=2020
and budgetyear=2020
group by empname, b.budgetyear, b.budgetamount
)
where budgetamount <Expenses
order by Difference desc

select distinct b.budgetyear, round(yrexp.exp,2) as TotalExpense, BA as TotalAmount, round(yrexp.exp-BA,2) as Difference from

(select sum(budgetamount) as BA, budgetyear from budget
group by budgetyear) b

inner join
(
SELECT year(t.startdate) as year, sum(amount) as exp from trip t
inner join tripdetail td on td.tripid=t.tripid
group by year(t.startdate)
) yrexp
on b.budgetyear=yrexp.year
where budgetyear in (2018,2019,2020)
order by budgetyear
;
Query1


Query4

5 Likes

@Alex_Garrett
I believe that your query and subsequent results for query 2 may be incorrect. In the solution query that was posted, the join of the trip detail and trip table cause the trip data to join as many times as there is a detail record and then sum the estimated costs. So for example, employee 1581 had only one trip starting in the year 2020, but has 5 trip detail rows. Therefore, when you sum the estimated costs for the single trip id, it multiples the single estimated cost by 5, rather than retaining the single estimated cost for single trip.
I may be mistaken, so please feel free to push back! Thanks!

1 Like

Great practice! My solutions for queries 1-3 were very similar to the solution given, but I did my 4th query a bit differently:

SELECT B.BudgetYear, B.TotalBudget, TD.TotalExpense, ROUND((B.TotalBudget - TD.TotalExpense), 2) AS Difference
FROM (SELECT b.BudgetYear, SUM(b.BudgetAmount) AS TotalBudget
FROM Budget AS b
WHERE b.BudgetYear BETWEEN 2018 AND 2020
GROUP BY b.BudgetYear) AS B INNER JOIN (SELECT YEAR(t.StartDate) AS BudgetYear, SUM(td.Amount) AS TotalExpense
FROM Trip AS t INNER JOIN TripDetail AS td ON t.TripID=td.TripID
WHERE YEAR(t.StartDate) BETWEEN 2018 AND 2020
GROUP BY YEAR(t.StartDate)) AS TD ON B.BudgetYear=TD.BudgetYear
GROUP BY B.BudgetYear, B.TotalBudget, TD.TotalExpense
ORDER BY B.BudgetYear;

Here are the results from all of my queries as well:

Query 1



Query 4

4 Likes

This was a great challenge! I loved learning how to import the access database, so I could complete the challenge in SQL Server.

My solution code in T-SQL for SQL server is here:

--Query 1
SELECT
    EmpName
    , COUNT(ReceiptIncluded) AS MissingReceipts
FROM
    dbo.Employee e
    JOIN dbo.Trip t ON e.EmpID = t.EmpID
    JOIN dbo.TripDetail td ON t.TripID = td.TripID
WHERE ReceiptIncluded = 'N'
GROUP BY EmpName
HAVING COUNT(ReceiptIncluded) > 5
ORDER BY COUNT(ReceiptIncluded) DESC
;

--Query 2
SELECT
    e.EmpName
    , CAST(SUM(t.EstimatedCosts) AS DECIMAL(7,2)) AS TotalEstCosts
    , CAST(SUM(td.Amount) AS DECIMAL(7,2)) AS TotalActCosts
    , CAST(SUM(t.EstimatedCosts) - SUM(td.Amount) AS DECIMAL(7,2)) AS Difference
FROM 
    dbo.Employee e
    JOIN dbo.Trip t ON e.EmpID = t.EmpID
    JOIN dbo.TripDetail td ON t.TripID = td.TripID
WHERE 
    YEAR(t.StartDate) = 2020
GROUP BY e.EmpName
HAVING SUM(t.EstimatedCosts) - SUM(td.Amount) < 0
ORDER BY SUM(t.EstimatedCosts) - SUM(td.Amount)
;

--Query 3
SELECT
    e.EmpName
    , CAST(AVG(b.BudgetAmount) AS DECIMAL(7,2)) AS BudgetAmount
    , CAST(SUM(td.Amount) AS DECIMAL(7,2)) AS TotalActCosts
    , CAST(AVG(b.BudgetAmount) - SUM(td.Amount) AS DECIMAL(7,2)) AS Difference
FROM 
    dbo.Budget b
    JOIN dbo.Employee e ON b.EmpID = e.EmpID    
    JOIN dbo.Trip t ON e.EmpID = t.EmpID
    JOIN dbo.TripDetail td ON t.TripID = td.TripID
WHERE
    b.BudgetYear = 2020
    AND YEAR(t.StartDate) = 2020
GROUP BY e.EmpName
HAVING AVG(b.BudgetAmount) - SUM(td.Amount) < 0
ORDER BY AVG(b.BudgetAmount) - SUM(td.Amount)
;

--Query 4
SELECT
    b.BudgetYear
    , CAST(b.BudgetAmount AS DECIMAL(10,2)) AS TotalBudget
    , CAST(SUM(td.Amount) AS DECIMAL(10,2)) AS TotalExpense
    , CAST(AVG(b.BudgetAmount) - SUM(td.Amount) AS DECIMAL(10,2)) AS Difference
FROM 
    ( SELECT BudgetYear, SUM(BudgetAmount) AS BudgetAmount
        FROM dbo.Budget 
        GROUP BY BudgetYear) b
    JOIN dbo.Trip t ON b.BudgetYear = YEAR(t.StartDate)
    JOIN dbo.TripDetail td ON t.TripID = td.TripID
WHERE b.BudgetYear IN (2018,2019,2020)
GROUP BY b.BudgetYear, b.BudgetAmount
ORDER BY b.BudgetYear
;
2 Likes

Looks like a fun challenge!

Since I’m a novice to SQL, this was a challenge, especially the last query, but this was really good! My queries are below:

SELECT e.EmpName, COUNT(td.ReceiptIncluded) AS MissingReceipts
FROM (Employee AS e INNER JOIN Trip AS t ON e.EmpID = t.EmpID) INNER JOIN TripDetail AS td ON t.TripID = td.TripID
WHERE td.ReceiptIncluded = “N”
GROUP BY e.EmpName
HAVING COUNT(td.ReceiptIncluded) > 5
ORDER BY COUNT(td.ReceiptIncluded) DESC;

SELECT e.EmpName, SUM(t.EstimatedCosts) AS TotalEstCosts, SUM(td.Amount) AS TotalActCosts, ROUND(SUM(t.EstimatedCosts) - SUM(td.Amount), 2) AS Difference
FROM (Employee AS e INNER JOIN Trip AS t ON e.EmpID=t.EmpID) INNER JOIN TripDetail AS td ON t.TripID=td.TripID
WHERE Year(t.StartDate) = 2020
GROUP BY e.EmpName
HAVING SUM(t.EstimatedCosts) - SUM(td.Amount) < 0
ORDER BY SUM(t.EstimatedCosts) - SUM(td.Amount);

SELECT e.EmpName, b.BudgetAmount, SUM(td.Amount) AS Expenses, ROUND(b.BudgetAmount - SUM(td.Amount), 2) AS Difference
FROM ((Employee AS e INNER JOIN Budget AS b ON e.EmpID=b.EmpID) INNER JOIN Trip AS t ON e.EmpID=t.EmpID) INNER JOIN TripDetail AS td ON t.TripID=td.TripID
WHERE b.BudgetYear = 2020 AND Year(t.StartDate) = 2020
GROUP BY e.EmpName, b.BudgetAmount
HAVING b.BudgetAmount - SUM(td.Amount) < 0
ORDER BY ROUND(b.BudgetAmount - SUM(td.Amount), 2);

SELECT u.BudgetYear, SUM(u.Budget) AS TotalBudget, SUM(u.Expenses) AS TotalExpenses, SUM(u.Budget) -SUM(u.Expenses) AS Difference
FROM (SELECT b.BudgetYear, SUM(b.BudgetAmount) AS Budget, 0 AS Expenses
FROM Budget b
WHERE b.BudgetYear BETWEEN 2018 AND 2020
GROUP BY b.BudgetYear
UNION
SELECT YEAR(t.StartDate) AS BudgetYear, 0 AS Budget, ROUND(SUM(td.Amount),2) AS Expenses
FROM Trip t INNER JOIN TripDetail td ON t.TripID=td.TripID
WHERE YEAR(t.StartDate) BETWEEN 2018 AND 2020
GROUP BY YEAR(t.StartDate)) AS u
GROUP BY u.BudgetYear;

1 Like

I’m pretty slow with SQL but this was a good opportunity to practice.

-Query 1
SELECT EmpName , COUNT(ReceiptIncluded) AS MissingReceipts FROM dbo.Employee e JOIN dbo.Trip t ON e.EmpID = t.EmpID JOIN dbo.TripDetail td ON t.TripID = td.TripID WHERE ReceiptIncluded = ‘N’ GROUP BY EmpName HAVING COUNT(ReceiptIncluded) > 5 ORDER BY COUNT(ReceiptIncluded) DESC ;

–Query 2
SELECT e.EmpName , CAST(SUM(t.EstimatedCosts) AS DECIMAL(7,2)) AS TotalEstCosts , CAST(SUM(td.Amount) AS DECIMAL(7,2)) AS TotalActCosts , CAST(SUM(t.EstimatedCosts) - SUM(td.Amount) AS DECIMAL(7,2)) AS Difference FROM dbo.Employee e JOIN dbo.Trip t ON e.EmpID = t.EmpID JOIN dbo.TripDetail td ON t.TripID = td.TripID WHERE YEAR(t.StartDate) = 2020 GROUP BY e.EmpName HAVING SUM(t.EstimatedCosts) - SUM(td.Amount) < 0 ORDER BY SUM(t.EstimatedCosts) - SUM(td.Amount) ;

–Query 3
SELECT e.EmpName , CAST(AVG(b.BudgetAmount) AS DECIMAL(7,2)) AS BudgetAmount , CAST(SUM(td.Amount) AS DECIMAL(7,2)) AS TotalActCosts , CAST(AVG(b.BudgetAmount) - SUM(td.Amount) AS DECIMAL(7,2)) AS Difference FROM dbo.Budget b JOIN dbo.Employee e ON b.EmpID = e.EmpID JOIN dbo.Trip t ON e.EmpID = t.EmpID JOIN dbo.TripDetail td ON t.TripID = td.TripID WHERE b.BudgetYear = 2020 AND YEAR(t.StartDate) = 2020 GROUP BY e.EmpName HAVING AVG(b.BudgetAmount) - SUM(td.Amount) < 0 ORDER BY AVG(b.BudgetAmount) - SUM(td.Amount)

–Query 4
SELECT b.BudgetYear , CAST(b.BudgetAmount AS DECIMAL(10,2)) AS TotalBudget , CAST(SUM(td.Amount) AS DECIMAL(10,2)) AS TotalExpense , CAST(AVG(b.BudgetAmount) - SUM(td.Amount) AS DECIMAL(10,2)) AS Difference FROM ( SELECT BudgetYear, SUM(BudgetAmount) AS BudgetAmount FROM dbo.Budget GROUP BY BudgetYear) b JOIN dbo.Trip t ON b.BudgetYear = YEAR(t.StartDate) JOIN dbo.TripDetail td ON t.TripID = td.TripID WHERE b.BudgetYear IN (2018,2019,2020) GROUP BY b.BudgetYear, b.BudgetAmount ORDER BY b.BudgetYear ;

Good challenge, I had to look up some things and ask about some others learned a lot from it! I decide to also sort them in a way that I though they were better to look at going always from big to small either difference or in case or query 1 by the amount of missing receipts as you can see in the highlighted columns of the screenshots.
Screenshot 2023-10-09 022345



Screenshot 2023-10-09 022634

This was a great practice problem as we are learning about these problems currently in the junior core. The first two problems were also great reviews from my IS 201 class.

image
image
image
image

Today in class we discussed the different between using WHERE and HAVING. It was fun to practice that during this challenge. Once again, thanks for the helpful solution video as well to explain some confusing parts of the 4th Query. Here is my SQL for that one: