96|PYTHON – Investment Visualizations

BYU Student Author: @Hyrum
Reviewers: @Donovon, @Erick_Sizilio
Estimated Time to Solve: 20 Minutes

We provide the solution to this challenge using:

  • Python

You and your friend just returned from a game-changing talk about smart investing, and you’re feeling invigorated and ready to start your investment journey. But you’re not going to settle for just any kind of investing - you want to do it the smart way.

You and your friend decide to take on the ultimate challenge and investigate some companies. Your goal? To find the best investment opportunities and earn passive income like a boss!

You come across a spreadsheet filled with data about different companies, and you’re eager to start analyzing. But there’s a catch: you don’t have fancy software like Tableau or Alteryx to make the kind of graphs you want. Don’t worry, though. You’ve got a secret weapon: your top-notch Python skills!

With determination and a can-do attitude, you start creating some seriously impressive tables in Python. You’re not going to let a lack of resources hold you back - you’re going to make this happen no matter what.

As you work, you can feel your excitement growing. You’re learning new things, honing your skills, and taking control of your financial future like a pro. And when you’re finished, you’ve got some seriously impressive visualizations that are going to take your investment game to the next level.

Load the dataset from the link into a DataFrame and fill all the null values with 0. Then make the following graphs.

  1. Make a bar graph that shows the top ten revenue growth companies with their revenue growths shown as percentages, like the graph below:

  2. After considering the matter, you may wonder if pursuing revenue growth is the optimal course of action. While revenue growth is typically an indicator of a company’s expansion, it does not guarantee a positive return for investors. As such, it may be beneficial to assess the return on equity, which is a traditional metric used to gauge investor returns. To evaluate the relationship between revenue growth and return on equity, conduct a regression analysis: make a scatterplot of the two measures and fit a regression line through it.

  3. You have decided to gather additional information and have concluded that return on equity is more critical than revenue growth information. While revenue growth can provide insight into a company’s future, it may not have a direct impact on you as an investor. Therefore, you have chosen to focus on evaluating return on equity, enterprise value, and debt-to-equity measures. You plan to analyze all three measures simultaneously by creating a 3D graph with these metrics as the different axes. Additionally, you plan to use the short name of the company as a color-coding mechanism to differentiate them on the graph.

Data Files

Suggestions and Hints

Use matplotlib, pandas, seaborn, numpy, scipy.stats, and plotly.express.

Here are some resources that can help with the application of these codes:

For the bar graph, follow these steps:

  • Filter the DataFrame to show only the top 10 companies with the highest revenue growth.
  • Sort the top 10 companies by the revenueGrowth column in descending order.
  • Plot a bar graph of the top 10 companies with the highest revenue growth, sorted by value.
  • Get the largest revenueGrowth amount.
  • Add a text message to the side of the graph with the largest revenueGrowth amount.
  • Add a y-label.
  • Display the y-axis labels as percentages.
  • Display the graph.
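
The bar graph steps above could be sketched roughly as follows. This is only one possible approach, not the official solution. The function name plot_top_revenue_growth and the sample rows are hypothetical, and the sketch assumes the revenueGrowth values are already in percent units (so 12.5 means 12.5%).

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import pandas as pd


def plot_top_revenue_growth(df, n=10):
    """Bar chart of the n companies with the highest revenueGrowth."""
    # nlargest filters to the top n rows and returns them sorted descending.
    top = df.nlargest(n, "revenueGrowth")
    largest = top["revenueGrowth"].max()

    fig, ax = plt.subplots(figsize=(10, 6))
    ax.bar(top["shortName"], top["revenueGrowth"])
    ax.set_ylabel("Revenue Growth")
    # Assumption: values are already percentages, so xmax=100 maps 12.5 to "12.5%".
    ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=100))
    # Place the note in axes coordinates so it stays put whatever the data range.
    ax.text(0.4, 0.5, f"Largest revenueGrowth amount:\n{largest:.2f}%",
            transform=ax.transAxes, fontsize=12, va="center")
    ax.tick_params(axis="x", labelrotation=45)
    fig.tight_layout()
    return fig, ax, top


# Hypothetical sample rows, only here to make the sketch runnable.
sample = pd.DataFrame({
    "shortName": [f"Company {i}" for i in range(12)],
    "revenueGrowth": [4.2, 18.9, 7.5, 25.3, 11.0, 2.1,
                      15.6, 9.8, 21.4, 6.3, 13.7, 1.0],
})
fig, ax, top = plot_top_revenue_growth(sample)
```

Calling plt.show() afterwards displays the figure. With the real data, pass the DataFrame you loaded instead of sample.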

For the regression analysis, follow these steps:

  • Create the scatter plot with seaborn.
  • Add axis labels and title.
  • Calculate the linear regression line using scipy.stats.linregress.
  • Plot the linear regression line.
  • Plot the r value, p-value, r-squared value, and linear regression equation on the image as text.
  • Show the plot.
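
As a rough illustration of the regression steps above, here is a minimal sketch. The function name plot_growth_vs_roe, the text placement, and the sample rows are assumptions made for this example. The column names revenueGrowth and returnOnEquity are the ones used in the replies in this thread.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy.stats import linregress


def plot_growth_vs_roe(df):
    """Scatter plot of revenueGrowth vs. returnOnEquity with a fitted line."""
    x = df["revenueGrowth"]
    y = df["returnOnEquity"]
    # Ordinary least-squares fit of y on x.
    result = linregress(x, y)

    fig, ax = plt.subplots(figsize=(8, 6))
    sns.scatterplot(x=x, y=y, ax=ax)
    ax.plot(x, result.intercept + result.slope * x,
            color="red", label="Regression line")
    ax.set_xlabel("Revenue Growth")
    ax.set_ylabel("Return on Equity")
    ax.set_title("Revenue Growth vs. Return on Equity")

    # Summarize the fit as text in the top-left corner (axes coordinates).
    summary = (
        f"r = {result.rvalue:.3f}\n"
        f"p-value = {result.pvalue:.3g}\n"
        f"r-squared = {result.rvalue ** 2:.3f}\n"
        f"y = {result.slope:.3f}x {result.intercept:+.3f}"
    )
    ax.text(0.05, 0.95, summary, transform=ax.transAxes, va="top")
    ax.legend(loc="lower right")
    return fig, ax, result


# Hypothetical sample rows, only here to make the sketch runnable.
sample = pd.DataFrame({
    "revenueGrowth": [1.0, 2.5, 4.0, 6.5, 8.0, 10.0],
    "returnOnEquity": [3.1, 4.0, 7.2, 9.9, 11.5, 16.0],
})
fig, ax, result = plot_growth_vs_roe(sample)
```

Call plt.show() to display the figure. The r-squared value is simply the square of the r value that linregress returns.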

For the 3-D graph, follow these steps:

  • Select the relevant columns for the 3D scatter plot.
  • Create the 3D scatter plot.
  • Display the graph.

For further hints, here is what the last two graphs are meant to look like:



Upload the datasheet as a DataFrame and fill all the null values with 0.
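
A minimal sketch of this step is shown below. It assumes the data file is an Excel workbook, as in one of the replies further down. The helper name load_companies and the sample rows are hypothetical.

```python
import pandas as pd


def load_companies(path):
    """Read the challenge workbook and replace every missing value with 0."""
    # Assumption: the file is an Excel workbook; use pd.read_csv for a CSV file.
    return pd.read_excel(path).fillna(0)


# The same fillna step on a small hypothetical frame, so it runs without the file.
sample = pd.DataFrame({
    "shortName": ["Alpha Co", "Beta Inc"],
    "revenueGrowth": [1.5, None],
})
cleaned = sample.fillna(0)
```

With the real data you would call load_companies with the path to your downloaded copy of the file.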


Bar Graph


Regression Analysis


3-D Graph


Solution Video: Challenge 96|PYTHON – Investment Visualization

I really liked this challenge! It was especially cool to use a 3-D graph!
Here is my solution!

Here is my code!

import matplotlib.pyplot as plt
import pandas as pd

# Load the data (file name assumed from the other reply in this thread; point it at your copy)
df = pd.read_excel('Challenge96_Data.xlsx')
df.fillna(0, inplace=True)
df.sort_values(by='revenueGrowth', ascending=False, inplace=True)
# Keep the ten companies with the highest revenue growth
df_10 = df.head(10)

plt.bar(df_10['shortName'], df_10['revenueGrowth'], label='revenueGrowth')
plt.ylabel('Revenue Growth')
plt.xlabel('Short Name')
plt.legend(loc='upper right')
multiline_text = "Largest revenue growth amount:\n30.65%"
plt.text(2.5, 15, multiline_text, ha='left', va='center', fontsize=14)  # adds text
plt.show()

import matplotlib.pyplot as plt
from scipy.stats import linregress

# Perform linear regression
slope, intercept, r_value, p_value, std_err = linregress(df['revenueGrowth'], df['returnOnEquity'])
# Print the regression parameters
print(f"Slope: {slope}")
print(f"Intercept: {intercept}")
print(f"R-squared value: {r_value**2}")
print(f"P-value: {p_value}")
print(f"Standard error: {std_err}")
# Calculate the regression line (reuses the linregress fit; np.polyfit would give the same line)
regression_line = slope * df['revenueGrowth'] + intercept
# Create a scatter plot
plt.scatter(df['revenueGrowth'], df['returnOnEquity'], label='Data Points')
multiline_text = f"slope: {slope}\nr-squared: {r_value**2:.3f}\nP-value: {p_value}\nregression equation: {slope}x + {intercept:.2f}"
plt.text(5, 10, multiline_text, ha='left', va='center', fontsize=14)  # adds text
# Plot the regression line
plt.plot(df['revenueGrowth'], regression_line, color='red', label='Regression Line')
# Add labels and legend
plt.xlabel('Revenue Growth')
plt.ylabel('Return on Equity')
plt.legend()
# Show the plot
plt.show()

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # only needed on older Matplotlib versions
import seaborn as sns

# Create a figure and a 3D axis
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
# Set axis labels
ax.set_xlabel('Enterprise Value')
ax.set_ylabel('Return On Equity')
ax.set_zlabel('Debt to Equity')
# Generate one distinct color per company
num_points = len(df['shortName'])
colors = sns.color_palette("husl", n_colors=num_points)
# Plot each company as its own labeled 3D point
for name, xi, yi, zi, color in zip(df['shortName'], df['enterpriseValue'], df['returnOnEquity'], df['debtToEquity'], colors):
    ax.scatter(xi, yi, zi, c=[color], marker='o', label=name)
# Add a legend
legend = ax.legend(loc='upper left', bbox_to_anchor=(1.25, 1), prop={'size': 8})
# Show the plot
plt.show()

This was great! Here is my solution

import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.ticker as mtick
import seaborn as sns
import numpy as np
from scipy.stats import linregress
import plotly.express as px

df = pd.read_excel("Challenge96_Data.xlsx")
df = df.fillna(0)

top_10 = df.nlargest(10, 'revenueGrowth')
top_10 = top_10.sort_values('revenueGrowth', ascending=False)

max_revenueg = top_10['revenueGrowth'].max()

fig, ax = plt.subplots(figsize=(10, 6))
bars = ax.bar(top_10['shortName'], top_10['revenueGrowth'], label='revenueGrowth')
ax.set_ylabel('Revenue Growth')
ax.set_xlabel('Short Name')

# Rotate x-tick labels for better readability
plt.setp(ax.get_xticklabels(), rotation=45, ha='right')

# Adjust layout to prevent clipping of tick labels
plt.tight_layout()

ax.set_ylim(0, max_revenueg)
ax.text(.4, 0.5, f'Largest revenueGrowth amount:\n{(max_revenueg/100):.2%}', transform=ax.transAxes, fontsize=12, va='center')

# Show the y-axis ticks as percentages
fmt = '%.0f%%'
yticks = mtick.FormatStrFormatter(fmt)
ax.yaxis.set_major_formatter(yticks)

ax.legend(loc='upper right')
plt.show()

# Start a new figure so the scatter plot does not land on the bar chart
plt.figure()
plt.scatter(df['revenueGrowth'], df['returnOnEquity'])
plt.xlabel('Revenue Growth')
plt.ylabel('Return on Equity')
plt.title('Scatter Plot of Revenue Growth vs. Return on Equity')

x = df['revenueGrowth']
y = df['returnOnEquity']
slope, intercept, r_value, p_value, std_err = linregress(x, y)

# Draw the fitted regression line over the scatter plot
plt.plot(x, intercept + slope * x, color='lightblue')
plt.show()

three_df = df[['shortName', 'returnOnEquity', 'enterpriseValue', 'debtToEquity']]
all_fig = px.scatter_3d(three_df, x='returnOnEquity', y='enterpriseValue', z='debtToEquity', color='shortName')
all_fig.show()
