Visualize the trend with pandas rolling statistics
In today’s issue, I’m going to show you how to apply rolling statistics to stock prices with pandas.
“Rolling” a statistic applies a calculation to a chunk of data, slides (or rolls) the chunk forward, and does it again. It’s how many technical analysis indicators, such as moving averages, are calculated.
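Here’s a minimal sketch of the idea on a toy series. The numbers are made up and the 3-row window is only for illustration.

```python
import pandas as pd

# Toy series standing in for closing prices (made-up numbers).
prices = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])

# Each result is the mean of the current row and the two rows before it.
# The first two results are NaN because the window is not full yet.
rolling_mean = prices.rolling(window=3).mean()
print(rolling_mean.tolist())
```

The window never looks ahead. Each value only uses the rows up to and including its own.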
Today I’ll show you two examples: z-score and minimum return.
Learning how to apply rolling statistics unlocks the power of pandas:
- Identify outliers
- Visualize trends
- Apply predictive measures
Unfortunately, most people are lost when it comes to rolling statistics.
But you’re in luck!
I’m going to show you how to do it. Step by step.
Step 1: Get the data
I’m using Jupyter Notebook. I want to plot my charts inline, so I call %matplotlib inline first.
We’ll start by importing the libraries we need.
%matplotlib inline
import yfinance as yf
Let’s get some data.
data = yf.download("NFLX", start="2020-01-01", end="2022-06-30")
We’ll use yfinance to get stock data – in this case, Netflix. You can use any stock and any date range you want.
Step 2: Define the function for z-score
The z-score is the number of standard deviations a value is away from its mean. It’s a great way to summarize where a value lies on a distribution.
For example, if you’re 189 cm tall, the z-score of your height might be 2.5. That means you are 2.5 standard deviations away from the mean height of everyone in the distribution.
The math is simple:
(value - average value) / standard deviation of values
Here’s what it looks like in Python.
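def z_score(chunk):
    # Last value in the window, in standard deviations from the window mean
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()
This function accepts a chunk of data (a pandas Series). Then it takes the last value from the chunk, subtracts the mean (average), and divides by the standard deviation.
The .iloc[-1] means “take the last value from the chunk” by position. Recent pandas versions warn if you use plain chunk[-1] on a date-indexed Series.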
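To check the function by hand, here’s a quick sketch on a five-value chunk. The values are made up, and this version uses .iloc[-1] so the last value is always picked by position.

```python
import pandas as pd

def z_score(chunk):
    # Last value in the chunk, in standard deviations from the chunk mean
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()

# Made-up chunk: the last value sits well above the rest.
chunk = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])

# mean = 4.0, sample standard deviation = sqrt(12.5), so z = 6 / sqrt(12.5)
z = z_score(chunk)
print(round(z, 2))
```

Note that pandas computes the sample standard deviation (ddof=1) by default, so the result differs slightly from a population z-score.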
Step 3: Create the rolling statistic
Creating the rolling z-score is one line of code using pandas.
rolled = data.Close.rolling(window=30).apply(z_score)
We use the closing price and apply the rolling function to it. The job of rolling is to take 30 rows of data and apply the z_score function to those rows. Then move forward one row, and do it again.
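If apply feels slow on a long history, the same z-score can be built from the rolling mean and rolling standard deviation. This is a sketch on synthetic random-walk prices (an assumption so it runs offline), not the Netflix data.

```python
import numpy as np
import pandas as pd

def z_score(chunk):
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()

# Synthetic random-walk prices, used only so the example runs without a download.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 250).cumsum())

window = 30
rolled_apply = close.rolling(window=window).apply(z_score)

# Vectorized version: current value minus rolling mean, divided by rolling std.
rolling = close.rolling(window=window)
rolled_fast = (close - rolling.mean()) / rolling.std()

# The first window - 1 rows are NaN because the window is not full yet.
print(rolled_apply.isna().sum())
print(np.allclose(rolled_apply.dropna(), rolled_fast.dropna()))
```

Both versions leave the first 29 rows empty, which is worth remembering when the plot starts a month after the data does.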
Now we can plot the z-score over time.
And as a histogram.
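Here’s one way the two charts could be drawn. This is a sketch, not necessarily the code behind the charts in this issue. The `rolled` series below is placeholder data so the block runs on its own; swap in the real one from Step 3.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Placeholder z-scores (random numbers), standing in for `rolled` from Step 3.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2020-01-01", periods=250)
rolled = pd.Series(rng.normal(0, 1, 250), index=dates)

fig, (ax_line, ax_hist) = plt.subplots(2, 1, figsize=(8, 6))

# Line chart: z-score over time
rolled.plot(ax=ax_line, title="Rolling 30-day z-score")

# Histogram: how often each z-score level occurs (NaN values are dropped)
rolled.hist(bins=20, ax=ax_hist)
ax_hist.set_title("Distribution of z-scores")

fig.tight_layout()
```

With %matplotlib inline, the figure shows up under the cell automatically.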
You can see a large negative z-score of -4.4. That close was 4.4 standard deviations below the mean of its 30-day window! It corresponds to the -35% drop in NFLX on 20 April 2022.
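To find the date of an extreme reading without squinting at the chart, idxmin does the job. The sketch below uses synthetic prices with one injected 35% drop (made up, not the NFLX data) to show the pattern.

```python
import numpy as np
import pandas as pd

def z_score(chunk):
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()

# Synthetic daily returns with a single injected crash (all numbers made up).
rng = np.random.default_rng(2)
dates = pd.bdate_range("2021-01-01", periods=200)
returns = rng.normal(0, 0.005, 200)
returns[150] = -0.35
close = pd.Series(100 * np.cumprod(1 + returns), index=dates)

rolled = close.rolling(window=30).apply(z_score)

# idxmin returns the index label (here, the date) of the lowest z-score.
worst_day = rolled.idxmin()
print(worst_day.date(), round(rolled.min(), 2))
```

On the real data, the same two lines applied to `rolled` from Step 3 would return the date of the most negative z-score.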
Step 4: Rolling minimum return
Let’s take a look at the largest single-day percentage drop within each rolling 30-day window.
min_pct_change = (
    data
    .Close
    .pct_change()
    .rolling(window=30)
    .min()
)
Here, we calculate the daily percentage change on the closing price. Then we apply the min function to the rolling window of data.
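A small example makes the two steps easier to follow. The prices are made up and the window is shortened to 3 rows so the result can be checked by hand.

```python
import pandas as pd

# Made-up closing prices
close = pd.Series([100.0, 110.0, 99.0, 99.0, 108.9, 108.9])

# Daily returns: NaN, +10%, -10%, 0%, +10%, 0%
daily_returns = close.pct_change()

# Worst daily return within each 3-row window
min_return = daily_returns.rolling(window=3).min()
print(min_return.round(4).tolist())
```

The first return is NaN because there is no prior price, so the first full window arrives one row later than it would on the raw prices. With a 30-row window, that means the first 30 results are empty.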
Here’s the plot.
And the histogram.
You can see Netflix has had a couple of pretty bad days!