BYU Student Author: @Trent_Barlow
Reviewers: @Carter_Lee
Estimated Time to Solve: 20 Minutes
We provide the solution to this challenge using:
- Python
Need a program? Click here.
Overview
You have recently been hired as the CFO of a midsized company that sells widgets to other businesses. Unfortunately, the company grew quickly, and until you came aboard, did not have a CPA on the staff. You have looked over the books and they are a mess. You don’t have time to go through every transaction with all your other responsibilities, and the company will not have interns until next summer, so you decide to enlist AI and Python in your battle with the books. Your goals is to create some sort of semi-replicable process to get transactions cleaned up and posted in the right format. You take a sample of transactions from the first few months of 2023 and get right to work.
Instructions
- Create a new Jupyter Notebook file. Take the Dirty_Transactions CSV file and clean it up using AI. Be sure to remove duplicates and fill in missing values based off the other transactions. Chat GPT 4 is highly recommended for this step, or another tool that can take in and analyze a CSV file. If you don’t have access such a tool, please skip this step and download Cleaned_Transactions.csv.
- Using your AI-cleaned CSV or Cleaned_Transactions.csv, read in the data as a new Dataframe to your Jupyter Notebook file, parsing dates, sorting by date and assigning a new index.
- Because your company signs long-term contracts and delivers goods over a period of time, you are lucky enough to know all your profit percentages by customer. You have this information in Profits.CSV. Read in this information and create a new column in the DataFrame created above called Profit using this information. Then, make a new DataFrame with just the Date column, to use later, and drop the Date column.
- You have some Example_Transactions you can look at to determine a pattern as to how to structure your new DataFrame. Add the columns Debit_Amt1, Credit_Amt1, Debit_Amt2, Credit_Amt2 to the DataFrame with the appropriate values.
- Next read in the Example_Transactions.csv and manipulate the data in it as well as the main DataFrame you have been using, in order to merge the two together to get the correct values for Debit1, Credit1, Debit2, and Credit2. This may require some ingenuity (and a dash of AI)! Hint: the sales transactions are the only ones that have 2 journal entries. They are also the only ones wherein the Memo is different depending on what company they are selling to.
- Merge the Date back into the DataFrame, then take only the columns Date, Memo and Debit and Credit and their respective Amt columns to create your final DataFrame.
- Write all of the journal entries to an Excel workbook called “Journal Entries”. The format should be the same as below:
Data Files
- Challenge196_Dirty_Transactions.csv
- Challenge196_Cleaned_Transactions.csv
- Challenge196_Profits.csv
- Challenge196_Example_Data.csv
Solution