BYU Student Author: @Mike_Paulsin
Reviewers: @Jae, @Nate
Estimated Time to Solve: 30 - 45 Minutes
We provide the solution to this challenge using:
- Python
Overview
Web scraping is the process of extracting data from websites automatically with code, rather than copying and pasting it by hand. Python is a popular language for web scraping because it is easy to use, offers many web scraping libraries, and can handle a wide variety of web data formats.
In this challenge, use Python and the requests and BeautifulSoup libraries to extract financial data from Yahoo Finance for a given stock ticker symbol. The program should send a GET request to the Yahoo Finance website with the ticker symbol as a parameter in the URL, and then parse the HTML response to extract the desired financial data from the HTML elements on the page.
Instructions
- Ask the user to enter a stock ticker symbol (e.g. AAPL, MSFT, TSLA).
- Use the requests library to send a GET request to the Yahoo Finance website with the ticker symbol as a parameter in the URL. (Remember to include headers in your request!)
- Parse the HTML response and extract the following financial data:
  - Market open price
  - Previous market close price
  - Market capitalization
  - Price-to-earnings ratio (P/E ratio)
  - Dividend yield (if available)
- Print out the extracted financial data in a readable format.
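As a starting point, the request step might be sketched as follows. This is a minimal sketch, not the official solution: the helper names `build_quote_url` and `fetch_quote_html`, the `User-Agent` value, the 10-second timeout, and the choice to upper-case the ticker are all assumptions made for illustration. The URL pattern is the one used in the solution code further down.

```python
import requests

# URL pattern taken from the solution code in this post.
BASE_URL = "https://finance.yahoo.com/quote/"
# Assumption: a simple browser-like User-Agent value; adjust as needed.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def build_quote_url(ticker):
    """Return the quote page URL, trimming whitespace and upper-casing the ticker."""
    return BASE_URL + ticker.strip().upper()


def fetch_quote_html(ticker, timeout=10):
    """Send the GET request with headers and return the page HTML as text."""
    response = requests.get(build_quote_url(ticker), headers=HEADERS, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


# Building the URL needs no network access, so it is safe to try first.
print(build_quote_url(" aapl "))
```

Calling `fetch_quote_html("AAPL")` would then return the HTML to parse in the next step, assuming the site accepts the request.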
Example Output
Name: Apple Inc.
Open: 159.37
Close: 157.65
Market Cap: 2.544T
P/E Ratio: 27.25
Dividend Yield: 0.92 (0.58%)
Good luck and happy coding!
Suggestions and Hints
The URL for the Yahoo Finance page for a given stock ticker symbol looks like this: https://finance.yahoo.com/quote/<TICKER_SYMBOL>
There are a few ways to achieve the desired output. Some possible ways to go about solving this might include the following:
Use your browser’s developer tools to inspect the page and find the appropriate elements. You can then reference these elements in your Python code using XPath.
Use text parsing to pull the desired information out of the response. Convert the response content to a string, then use Python's split method to cut away the surrounding content until only the desired value remains.
Use BeautifulSoup and the get_text() method to extract the text content of an HTML element.
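To compare the three hints side by side, here is a rough sketch that runs each of them against a small sample fragment instead of the live page. The fragment is hypothetical: the `data-test` attribute name and the values are assumptions modeled on the markers used in the solution code, and the real page may look different. The XPath variant uses the standard library's ElementTree, which supports only a small XPath subset and needs well-formed markup, so a real page would probably call for a more lenient parser.

```python
import xml.etree.ElementTree as ET

from bs4 import BeautifulSoup

# Hypothetical fragment: the data-test attribute name and the values are
# assumptions for illustration, not a copy of the live page.
SAMPLE_HTML = (
    "<table>"
    '<tr><td>Open</td><td data-test="OPEN-value">159.37</td></tr>'
    '<tr><td>Previous Close</td><td data-test="PREV_CLOSE-value">157.65</td></tr>'
    "</table>"
)


def by_split(html, marker):
    """Text parsing: keep what follows the marker, up to the next tag."""
    return html.split(f'{marker}">')[1].split("<")[0]


def by_get_text(html, marker):
    """BeautifulSoup: find the element by attribute, then read its text."""
    soup = BeautifulSoup(html, "html.parser")
    cell = soup.find("td", attrs={"data-test": marker})
    return cell.get_text(strip=True)


def by_xpath(html, marker):
    """XPath: ElementTree handles a limited XPath subset on well-formed markup."""
    root = ET.fromstring(html)
    cell = root.find(f".//td[@data-test='{marker}']")
    return cell.text


# Each approach should pull the same value out of the sample fragment.
for extract in (by_split, by_get_text, by_xpath):
    print(extract.__name__, extract(SAMPLE_HTML, "OPEN-value"))
```

The split approach is the shortest but depends on the exact surrounding text, while the other two look elements up by attribute and tend to survive small layout changes better.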
Solution
Solution Code
import requests
from bs4 import BeautifulSoup
# Headers to send with the request, as the instructions call for
user_agent = 'Me'
myheaders = {'User-Agent': user_agent}
# Ask user for stock ticker symbol
ticker = input("Enter a stock ticker symbol: ")
# Send GET request to Yahoo Finance and parse HTML response
url = f"https://finance.yahoo.com/quote/{ticker}"
response = requests.get(url, headers=myheaders)
soup = BeautifulSoup(response.content, "html.parser")
html = str(soup)
# Extract financial data using text parsing:
# take the text after each marker, up to the next "<"
name = html.split('<title>')[1].split('(')[0].strip()
# Named open_price so the built-in open() is not shadowed
open_price = html.split('OPEN-value">')[1].split('<')[0]
close = html.split('PREV_CLOSE-value">')[1].split('<')[0]
market_cap = html.split('MARKET_CAP-value">')[1].split('<')[0]
pe_ratio = html.split('PE_RATIO-value">')[1].split('<')[0]
div_yield = html.split('DIVIDEND_AND_YIELD-value">')[1].split('<')[0]
# Print output
print(f"Name: {name}")
print(f"Open: {open_price}")
print(f"Close: {close}")
print(f"Market Cap: {market_cap}")
print(f"P/E Ratio: {pe_ratio}")
print(f"Dividend Yield: {div_yield}")
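The instructions ask for the dividend yield only "if available", and indexing the result of a split with `[1]` raises an IndexError when a marker is missing from the page. One possible way to guard against that is sketched below. The helper name `value_after`, the "N/A" default, and the sample fragment are assumptions for illustration, not part of the original solution.

```python
def value_after(html, marker, default="N/A"):
    """Return the text between `marker` and the next "<", or `default` if absent."""
    _, found, rest = html.partition(marker)
    if not found:
        # Marker is not on the page (e.g. a stock that pays no dividend).
        return default
    return rest.split("<")[0].strip() or default


# Hypothetical fragment with no dividend field, for illustration only.
sample = '<td data-test="OPEN-value">159.37</td>'

print(value_after(sample, 'OPEN-value">'))
print(value_after(sample, 'DIVIDEND_AND_YIELD-value">'))
```

With a helper like this, each extraction line in the solution could call `value_after` with the page HTML and the same marker string it already uses.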
Solution Video: Challenge 44|PYTHON – Webscraping Introduction