BYU Student Author: @Jonathan_Weston
Reviewers: @Hyrum, @Parker_Sherwood
Estimated Time to Solve: 45 Minutes
We provide the solution to this challenge using:
- Python
Overview
You are a tax accountant at a small office who prepares tax forms for individuals. Currently, you have three clients: Gertrude, Jason, and MacDonald. Gertrude is old and can’t see. Her children still live with her, and she has hired an in-house assistant to help her get around. Jason is a single guy who has started his own business. MacDonald works on a farm with his children, and they also rent out a barn for people to stable their horses.
You have already gathered the tax forms that each client will need, but your filing software needs all of the tax forms to be merged into one file for each client. Rather than paying for expensive PDF editing software or using free but time-consuming websites, you have chosen to write a program in Python that will merge them automatically.
DISCLAIMER: These are not instructions on how to prepare or file taxes. This is simply an example of how to use Python to merge PDF files.
Instructions
You can write the code however you want, but it must have the following components:
- Use an input statement to ask the user for the file path to the Challenge PDF Merger folder.
- The program should merge the tax forms in each client’s folder and create a “Merged Tax Forms.pdf”. There should be three created files in total, one for each client. The solution for this challenge uses the pypdf library to merge the tax forms. Look under Suggestions and Hints to see how to install that library and the other libraries used for this challenge.
- The program should place the newly created “Merged Tax Forms.pdf” into each client’s folder.
- If that file already exists in a client’s folder, the program should neither append the new file to the old one nor leave a duplicate.
More detailed steps and instructions can be found under Suggestions and Hints if you want more guidance on how to create this program.
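As a minimal sketch of the first requirement, the prompt could be wrapped in a small function that keeps asking until the user enters a folder that exists. The names clean_folder_path and ask_for_challenge_folder are hypothetical helpers invented for this illustration; they are not part of the official solution.

```python
import os


def clean_folder_path(raw):
    """Normalize a folder path typed or pasted by the user (hypothetical helper)."""
    # Strip surrounding whitespace and any quotes left over from copy-pasting.
    cleaned = raw.strip().strip('"').strip("'")
    # Expand "~" and tidy up redundant separators.
    return os.path.normpath(os.path.expanduser(cleaned))


def ask_for_challenge_folder(ask=input):
    """Prompt until the user enters an existing folder, then return its path."""
    while True:
        folder = clean_folder_path(
            ask("Enter the file path for the Challenge PDF Merger folder: ")
        )
        if os.path.isdir(folder):
            return folder
        print(f"Folder not found: {folder}")
```

Calling challenge_dir = ask_for_challenge_folder() in a notebook cell shows the prompt. The ask parameter exists only so the function can be exercised without typing.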
Data Files
Suggestions and Hints
The pypdf library contains several modules and tools you can use to edit PDFs. In a notebook cell, try running the following to install the library:
- pip install pypdf
- If you get an error, try running this instead:
- pip install pypdf --user
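If you are unsure whether the install worked, one way to check from Python is to ask the import system whether it can find the package. This is only a sketch; is_installed is a hypothetical helper name.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# Example check for the library used in this challenge.
print("pypdf installed:", is_installed("pypdf"))
```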
I was able to create this program with the following functions and modules:
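- from pypdf import PdfMerger (recent pypdf releases deprecate PdfMerger; PdfWriter offers the same append method and can be used in its place)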
- from glob import glob
- import shutil
- import os
Here are more specific instructions for your code if you need them:
- Create a variable holding the path to the Challenge PDF Merger folder and another holding the path to the working directory where your Python notebook runs. Include an os.makedirs statement in case a file path does not exist yet.
- Create a list of the client folders within the Challenge folder
- Loop through each client folder and create a directory path for it
- Delete the Merged Tax Forms.pdf if it already exists. This prevents the program from appending to the pre-existing file.
- Create a list of the tax forms within the client folder
- Append the PDF files using the PdfMerger class:
# Append each PDF to the merger, in list order
merger = PdfMerger()
for pdf in pdfs:
    merger.append(pdf)
- Write the merged output to a file named “Merged Tax Forms.pdf” and then close the merger
- Move the file from your working directory back into the client’s folder.
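The steps above can be sketched roughly as follows. This is one possible arrangement under stated assumptions, not the official solution: list_client_folders, merge_client_folder, and MERGED_NAME are hypothetical names, the sketch swaps in PdfWriter for PdfMerger because recent pypdf releases recommend it, and it writes the merged file straight into the client folder instead of writing to the working directory and moving it with shutil.

```python
import os
from glob import glob

MERGED_NAME = "Merged Tax Forms.pdf"  # output name required by the challenge


def list_client_folders(challenge_dir):
    """Return the subfolders of the challenge folder, one per client."""
    entries = glob(os.path.join(challenge_dir, "*"))
    return sorted(path for path in entries if os.path.isdir(path))


def merge_client_folder(client_dir):
    """Merge every PDF in client_dir into MERGED_NAME and return its path."""
    # Imported here so list_client_folders works even without pypdf installed.
    # Assumption: PdfWriter is used in place of PdfMerger (same append method).
    from pypdf import PdfWriter

    output_path = os.path.join(client_dir, MERGED_NAME)
    # Delete an earlier result first so it is neither merged into the new
    # file nor left behind as a duplicate.
    if os.path.exists(output_path):
        os.remove(output_path)

    # List the tax forms after the deletion; sorting keeps the order stable.
    pdfs = sorted(glob(os.path.join(client_dir, "*.pdf")))

    merger = PdfWriter()
    for pdf in pdfs:
        merger.append(pdf)
    merger.write(output_path)
    merger.close()
    return output_path


def merge_all_clients(challenge_dir):
    """Create one merged PDF per client folder and return the output paths."""
    return [merge_client_folder(d) for d in list_client_folders(challenge_dir)]
```

Calling merge_all_clients(challenge_dir) with the path the user entered should leave one “Merged Tax Forms.pdf” in each client folder, and running it a second time should replace those files instead of growing them.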
Solution
Challenge31_Solution.docx
Solution Video: Challenge 31|PYTHON – PDF Merger