BYU Student Author: @Benjamin_Lau
Reviewers: @Marta_Ellsworth , @Jimmy_Han
Estimated Time to Solve: 30 Minutes
Overview
This challenge aims to test your skills in both data generation using ChatGPT and SQL querying with ChatGPT’s assistance. The challenge is designed to enhance your proficiency in data manipulation and SQL query formulation. It can hopefully help you learn SQL through ChatGPT and understand how powerful this tool can be.
Instructions
- Use ChatGPT to generate a fictitious firm expense dataset with information for transactionID, date, category, description, amount, currency and other relevant information you want to include. Remember to ask ChatGPT to output that in a .csv format.
a. Usually, ChatGPT 3.5 will only generate 10 rows of data at a time, but if you keep asking, it will produce more rows. For example, you can prompt “Can you generate more?” until you have as much data as you want. For the solution data, I asked 4 more times to generate 40+ rows. Do NOT close your ChatGPT session afterward.
- After generating the dataset, you can copy the entire CSV output by pressing “copy code” at the top right of the output box. Paste the dataset into an Excel worksheet. Then, parse the dataset into separate columns. See Hint #2.
- Open Microsoft Access and import the dataset by first creating a blank database. Then, under “External Data”, press “New Data Source” > “From File” > “Excel” to import the newly created Excel dataset.
- Write a SQL query to find the transactions with amounts equal to or larger than $1,500 and sort the output by amount descending. Display the columns Date, Category, Description, and Amount for these transactions.
- Ask ChatGPT to check your answer. Paste your SQL and prompt ChatGPT to check whether it achieves the purpose. An example prompt can be found under Hint #3. Alternatively, experiment with your own wording to see what works best when asking ChatGPT to check that your code achieves the desired purpose.
- Write another SQL query to find the sum of the transaction amounts for each category. Sort the output by the summed amount, descending. Display Category and Sum_Amount for each category. Use ChatGPT to check your answer again.
- Take a screenshot of the heading and first row of your dataset. Upload the screenshot and Access database file (.accdb) to this TechHub challenge post. Also, comment on the power of generative AI for database queries and answer the question “Do we still need to learn SQL given this power of Gen AI?”
Data Files
- You will create your own data file
Suggestions and Hints
- You can use prompts like, “Can you generate an expense dataset by an AI tech company and output the data in a .csv format with primary keys? Include columns for date, category, description, amount, currency and other relevant information.”
- Select all the data, including the headings. Go to Data > Text to Columns. Select Delimited and click Next. Check the box for Comma and uncheck all other boxes. Click Next. Set the destination to $A$1 and click Finish.
- An example prompt for checking answers can be “Here is my SQL. Would it achieve the purpose of finding the transactions with amounts equal to or larger than 1500 and sort the output by amount descending? And displaying the date, category, description, and amount for these transactions. Assuming I have parsed this dataset into multiple columns in a sheet named Sheet1.”
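For readers who want to see what the Text to Columns step does, here is a minimal Python sketch of the same comma-delimited parsing. It is not part of the challenge workflow, which uses Excel. The sample rows are made up for illustration; only the column names come from the instructions.

```python
import csv
import io

# Hypothetical sample standing in for the CSV text copied from ChatGPT.
# Column names follow the challenge; the two rows are invented for illustration.
pasted_csv = """transactionID,date,category,description,amount,currency
1,2023-01-05,Software,"Cloud hosting, annual plan",2400.00,USD
2,2023-01-09,Travel,Conference airfare,850.50,USD
"""

# csv.DictReader splits each line on commas and keeps quoted fields intact,
# so a comma inside a quoted description does not create an extra column.
reader = csv.DictReader(io.StringIO(pasted_csv))
rows = list(reader)

# Convert amount from text to a number so it can be compared and summed later.
for row in rows:
    row["amount"] = float(row["amount"])

print(reader.fieldnames)
print(rows[0]["description"], rows[0]["amount"])
```

The quoted description in the first row shows why a plain split on commas is risky: a description that contains a comma would be broken into two columns.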
Solution
Time to complete: 5 minutes
Difficulty: Easy
In this case, gen AI was able to answer the questions correctly. However, as we face more complex challenges, gen AI cannot solve our problems as well as we would like. I believe that we still need to learn SQL.
SELECT date, category, description, amount FROM firm_expenses_dataset WHERE amount >= 1500 ORDER BY amount DESC;
SELECT category, SUM(amount) AS Sum_Amount FROM firm_expenses_dataset GROUP BY category ORDER BY SUM(amount) DESC;
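If you want to try the two queries without Access, the sketch below runs them against an in-memory SQLite database from Python. This is a stand-in, not the Access setup used in the challenge: SQLite's dialect differs from Access SQL in places, the table name is assumed from the first query above, and the four rows are invented purely to show the filter, the sort, and the grouping.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Access file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE firm_expenses_dataset ("
    "transactionID INTEGER PRIMARY KEY, date TEXT, category TEXT, "
    "description TEXT, amount REAL, currency TEXT)"
)

# Invented sample rows; a real run would use the dataset generated by ChatGPT.
sample_rows = [
    (1, "2023-01-05", "Software", "Cloud hosting", 2400.00, "USD"),
    (2, "2023-01-09", "Travel", "Conference airfare", 850.50, "USD"),
    (3, "2023-02-14", "Hardware", "GPU workstation", 5200.00, "USD"),
    (4, "2023-03-02", "Software", "IDE licenses", 1500.00, "USD"),
]
conn.executemany(
    "INSERT INTO firm_expenses_dataset VALUES (?, ?, ?, ?, ?, ?)", sample_rows
)

# Query 1: transactions of 1,500 or more, largest first.
large = conn.execute(
    "SELECT date, category, description, amount "
    "FROM firm_expenses_dataset WHERE amount >= 1500 ORDER BY amount DESC"
).fetchall()

# Query 2: total amount per category, largest total first.
totals = conn.execute(
    "SELECT category, SUM(amount) AS Sum_Amount "
    "FROM firm_expenses_dataset GROUP BY category ORDER BY SUM(amount) DESC"
).fetchall()

for row in large:
    print(row)
for row in totals:
    print(row)
```

Note that the 1,500.00 row is included in the first result because the condition is `>=` rather than `>`, which is the kind of detail worth checking by eye before trusting a generated query.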
Time to complete: 30 minutes
Difficulty: Easy

I was unable to upload the Access database file because it said it is not a supported file type.
I believe that we absolutely still need to learn SQL, even with the power of GenAI. Although GenAI is exceptional at writing SQL, it is important to know what you can create or do with SQL in order to instruct GenAI effectively. For example, when I uploaded my data to Microsoft Access, it didn’t accept my “Date” column, but because I’ve learned SQL in the past, I knew that I could use an “Update” statement in SQL to add dates to the records. Because I knew this was a possibility, I was able to ask ChatGPT to help me write a script that would do so. It is also important to understand SQL so that you can take full responsibility for what happens to a dataset under your watch and for the data analytics that you are performing.
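The reply above does not show the script, so the following is only a sketch of the general idea: filling in a missing date column with an UPDATE statement keyed on the transaction ID. It uses SQLite from Python rather than Access, and the table name, the `expense_date` column, and the values are all hypothetical.

```python
import sqlite3

# In-memory SQLite table standing in for the imported Access table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE firm_expenses ("
    "transactionID INTEGER PRIMARY KEY, expense_date TEXT, amount REAL)"
)

# Rows imported without dates, mimicking a rejected date column on import.
conn.executemany(
    "INSERT INTO firm_expenses (transactionID, amount) VALUES (?, ?)",
    [(1, 2400.0), (2, 850.5)],
)

# Hypothetical (date, transactionID) pairs recovered from the original CSV.
dates_by_id = [("2023-01-05", 1), ("2023-01-09", 2)]

# One parameterized UPDATE per record; the WHERE clause limits each change
# to a single transaction so no other rows are touched.
conn.executemany(
    "UPDATE firm_expenses SET expense_date = ? WHERE transactionID = ?",
    dates_by_id,
)
conn.commit()

filled = conn.execute(
    "SELECT transactionID, expense_date FROM firm_expenses ORDER BY transactionID"
).fetchall()
print(filled)
```

Omitting the WHERE clause would set the same date on every row, which is an example of the kind of mistake that is easier to catch if you can read the SQL yourself.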
Time to complete: 30 minutes
Difficulty: Easy

It was fascinating to use ChatGPT to help me revise SQL code. Although it took some time to get the code right since I haven’t written SQL in a long time, ChatGPT was incredibly helpful in correcting my mistakes and guiding me through the process. I can definitely see how ChatGPT can be a valuable tool for generating datasets and practicing writing SQL code more effectively. I think it’s still important to know and learn SQL because there are times when you need to audit the process or explain what the code is doing.


Time spent: 1h20
Difficulty: Intermediate
Comment: It was my first time working with SQL, so it took me a long time just to download the necessary server and developer tools. Thanks to the help of a friend, I was able to set them up and start the assignment. Regarding the actual coding, ChatGPT was surprisingly efficient at coming up with code that worked on the first try. The code also looks very similar to Python or VBA code.
Time to complete: 60 minutes
Difficulty: Beginner
I was unable to upload my Access database file because the file type is not authorized.
It is clear that generative AI is a powerful tool for creating, revising, and reviewing database queries. However, I do think it is still important to learn SQL to understand the scope, capabilities, and limitations of SQL. There are also times when SQL created by generative AI won’t perform the way you need, so knowing the language is important to make adjustments along the way.

Time to complete: 45 minutes
Difficulty: Easy
I also could not get the file to upload.
I think that we will definitely still need to learn SQL on our own. GenAI just gives us the opportunity to be experts at many different programming platforms. If you can learn just enough about SQL to understand the code that ChatGPT is giving you, you will be able to edit it for your needs.
Time to complete: 40 minutes
I think we need to learn more about SQL. I would have finished this challenge faster if I had known more about the field settings in SQL.
I can’t upload an .accdb file in this reply.
Time to Complete: 10 minutes
Difficulty: Easy
Notes: Similar to what other people have commented on, AI has a place in learning, writing, and implementing SQL. In order to leverage AI effectively in more complex problems we need to understand SQL.

Query 1: SELECT Date, Category, Description, Amount
FROM firm_expenses
WHERE Amount >= 1500
ORDER BY Amount DESC;
Query 2: SELECT f.Category, SUM(f.Amount) AS Sum_Amount
FROM firm_expenses f
GROUP BY f.Category
ORDER BY SUM(f.Amount) DESC;