70|PYTHON – Dart Gun Rates

BYU Student Author: @Mark
Reviewers: @Jonathan_Weston, @Andrew
Estimated Time to Solve: 45 Minutes

We provide the solution to this challenge using:

  • Python

Need a program? Click here.

Overview
You work as a staff accountant (aka “Number Gunner”) at Big Bill’s, a company that sells high quality dart guns in many different countries worldwide. The company uses sales report software that doesn’t automatically update exchange rates each day. Big Bill’s management (aka “Top Gunners”) changes the company’s fixed exchange rates in the system occasionally, but not very consistently. Your fellow Number Gunner needs to figure out the sales discrepancies stemming from differences between the company’s fixed rates and the actual daily exchange rates in December. He knows that you are very skilled with your Python Pistols and can figure out how to scrape the daily rates for the currencies he needs, so he asks you to help him. He promises to present you with a gold star for excellence if you can write code (aka “The Secret Weapon”) that will pull the exchange rate for each unique currency and date combination in the spreadsheet he provides you. You salute and return to your station to create the perfect Secret Weapon…

Instructions

  1. Import the packages you need to run your code. See “Suggestions and Hints” for the packages used in the solution. Also, change the float format to include 6 decimal places. This will ensure that your rates are very precise.
  2. Import the data into a dataframe called “salesdf”
    • Create and use variables named “directory” and “filename” to import the data. This will allow the other Number Gunner to easily replace your directory and file name with his own going forward.
  3. Create a new dataframe called “ratesdf” from salesdf (this will avoid issues later). Remove all columns you don’t need and all duplicates from ratesdf.
    • Remember that you need unique Currency Code and Shipped Date combinations. Also, are there any currencies that you don’t need to pull exchange rates for (ie, exchange rate = 1)? Does any of the data in the columns need to be transformed to work in step 4?
  4. Iterate through the rows of the dataframe, using the data from the row to pull rates from the following website: EUR Historical Exchange Rates (Euro) - X-Rates
    • You may need to include headers (user agent) in order to access the website data.
    • Locate and replace the currency code and date within the website address to pull the rates. Make sure you match the formatting!
    • Be sure you are also pulling the correct rate. You want the rate that will translate to USD by multiplying the rate and the sales amount. Add the rates as a new column in the dataframe.
  5. Merge ratesdf with salesdf to place the rates on the proper rows. Fill any nulls in the merged dataframe with “1”. Export the dataframe to an .xlsx file.

Data Files

Suggestions and Hints
  • The packages used in the solution code are pandas, datetime, os, and requests.
  • You will need to clean up the Shipped Date column to get rid of the timestamp.
  • Use an f-string for plugging the row information into the website when you iterate through the rows. See this link for information on f-strings.
  • Here is an example of header (user agent) code:
    user_agent = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) 
    Chrome/95.0.4638.54' 
    
    myheaders = {'User-Agent': user_agent}
    
  • The provided solution does not interact with the website to get the rates; it simply reads the html and pulls the rate from it. Techniques that interact with the website require more packages.

Solution

`import requests
import pandas as pd
import warnings
warnings.filterwarnings(“ignore”)
pd.set_option(‘display.max_columns’, None)
pd.set_option(‘display.max_rows’, None)
pd.options.display.float_format = ‘{:,.6f}’.format

from datetime import datetime as dt

filename = “Dec Sales”
directory = “Challenge70_Data.xlsx”
salesdf = pd.read_excel(directory, sheet_name=filename)
salesdf[‘Shipped Date’] = salesdf[‘Shipped Date’].dt.strftime(‘%Y-%m-%d’)
salesdf.head()

ratesdf = salesdf[[‘Currency Code’,‘Currency’,‘Shipped Date’]].drop_duplicates().reset_index()
ratesdf = ratesdf[ratesdf[‘Currency Code’]!=‘USD’]
ratesdf.head()

user_agent = ‘Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36’
myheaders = {‘User-Agent’: user_agent}

for index, row in ratesdf.iterrows():
currency= row[‘Currency Code’]
date = row[‘Shipped Date’]
html = requests.get(f’{CURRENCY} Historical Exchange Rates () - X-Rates',
headers=myheaders).text
tables = pd.read_html(html)
fx = tables[0].iloc[0,1]
ratesdf.loc[index, ‘Rate’] = fx
display(ratesdf)

salesdf = pd.merge(salesdf,ratesdf, on = [‘Currency Code’,‘Shipped Date’],how = ‘left’)
salesdf = salesdf[[‘Shipped Date’,‘Currency Code’,‘Rate’,‘Order Total’]].fillna(1)
salesdf.to_excel(‘ExchangeRates.xlsx’,index= None)
salesdf.head()`

Here is the code I used to solve this problem! I tried to do the text splitting method to find the rates, but I could not get that to work.