Visualize the trend with pandas rolling statistics

In today’s issue, I’m going to show you how to apply rolling statistics to stock prices with pandas.

“Rolling” a statistic applies a calculation to a chunk of data, slides (or rolls) the chunk forward, and does it again. It’s how many technical analysis indicators, like moving averages, are calculated.
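To make the sliding chunk concrete, here is a minimal sketch on a made-up five-row price series (the numbers are hypothetical, not market data). It takes the mean of each 3-row chunk.

```python
import pandas as pd

# Hypothetical toy prices, not real market data
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# Mean of each 3-row chunk, sliding forward one row at a time.
# The first two rows do not have a full chunk behind them, so they come back as NaN.
rolling_mean = prices.rolling(window=3).mean()

print(rolling_mean.tolist())
```

The third value is the mean of 10, 11, and 12. The fourth drops the 10 and picks up the 13, and so on.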

Today I’ll show you two examples: z-score and minimum return.

Learning how to apply rolling statistics unlocks the power of pandas:

Identify outliers
Visualize trends
Apply predictive measures

Unfortunately, most people are lost when it comes to rolling statistics.

But you’re in luck!

I’m going to show you how to do it. Step by step.

Step 1: Get the data

I’m using Jupyter Notebook. I want to plot my charts inline, so I call %matplotlib inline first.

We’ll start by importing the libraries we need.

%matplotlib inline

import yfinance as yf

Let’s get some data.

data = yf.download("NFLX", start="2020-01-01", end="2022-06-30")

We’ll use yfinance to get stock data – in this case, Netflix. You can use any stock and any date range you want.

Step 2: Define the function for z-score

The z-score is the number of standard deviations a value is away from its mean. It’s a great way to summarize where a value lies on a distribution.

For example, if you’re 189 cm tall, the z-score of your height might be 2.5. That means you are 2.5 standard deviations away from the mean height of everyone in the distribution.

The math is simple:

(value - average value) / standard deviation of values

Here’s what it looks like in Python.

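def z_score(chunk):
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()

This function accepts a chunk of data. Then it takes the last value from the chunk, subtracts the mean (average), and divides by the standard deviation.

The .iloc[-1] means “take the last value from the chunk” by position. A plain chunk[-1] would be looked up as a label on the date index, which recent pandas versions warn about or reject.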
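Here is a quick sanity check of the function on a tiny made-up chunk, so you can follow the arithmetic by hand. This sketch uses .iloc[-1] for position-based access, which is my assumption about what works across pandas versions. The values are hypothetical.

```python
import pandas as pd

def z_score(chunk):
    # Last value minus the chunk mean, divided by the chunk's sample standard deviation
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()

# Hypothetical chunk: four ordinary values and one large last value
chunk = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])

mean = chunk.mean()  # (1 + 2 + 3 + 4 + 10) / 5 = 4.0
std = chunk.std()    # sample standard deviation (ddof=1) = sqrt(50 / 4)
z = z_score(chunk)   # (10 - 4) / sqrt(12.5)

print(mean, std, z)
```

Note that pandas uses the sample standard deviation (dividing by n - 1) by default, so the result can differ slightly from a textbook population z-score.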

Step 3: Create the rolling statistic

Creating the rolling z-score is one line of code using pandas.

rolled = data.Close.rolling(window=30).apply(z_score)

We use the closing price and apply the rolling function to it. The job of rolling is to take 30 rows of data and apply the z_score function to those rows. Then move forward one row, and do it again.
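If apply feels slow on a long history, the same rolling z-score can probably be built from the built-in rolling mean and standard deviation. The sketch below compares the two approaches on a synthetic random walk standing in for data.Close (the series, seed, and dates are assumptions for illustration, not NFLX data).

```python
import numpy as np
import pandas as pd

# Synthetic random-walk prices standing in for data.Close (not real NFLX data)
rng = np.random.default_rng(42)
close = pd.Series(
    100 + rng.normal(0, 1, 250).cumsum(),
    index=pd.bdate_range("2021-01-04", periods=250),
)

def z_score(chunk):
    return (chunk.iloc[-1] - chunk.mean()) / chunk.std()

window = 30

# Approach from the text: call z_score on every 30-row chunk
via_apply = close.rolling(window=window).apply(z_score)

# Vectorized alternative: current value minus rolling mean, over rolling std
roll = close.rolling(window=window)
vectorized = (close - roll.mean()) / roll.std()

# Both leave the first 29 rows as NaN and agree everywhere else
print(np.allclose(via_apply.dropna(), vectorized.dropna()))
```

Both versions use the sample standard deviation by default, which is why they line up. The apply version is easier to read and adapt. The vectorized one avoids calling a Python function once per row.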

Now we can plot the z-score over time.

rolled.plot()

And as a histogram.

rolled.hist(bins=20)

You can see a large negative z-score of -4.4. That close was 4.4 standard deviations below its rolling 30-day mean! It corresponds to the 35% drop in NFLX on 20 April 2022.
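To find the date behind an extreme reading instead of eyeballing the chart, idxmin returns the index label of the lowest value. Here is a small sketch on synthetic prices with an injected one-day drop (the series, the seed, and the row where the drop lands are all made up for illustration).

```python
import numpy as np
import pandas as pd

# Synthetic prices with a hypothetical 35% one-day drop injected at row 90
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=120)
prices = pd.Series(100 + rng.normal(0, 0.5, 120).cumsum(), index=dates)
prices.iloc[90:] = prices.iloc[90:] * 0.65

# Rolling 30-row z-score of the latest price in each window
roll = prices.rolling(window=30)
rolled = (prices - roll.mean()) / roll.std()

# Date and size of the most negative z-score
worst_date = rolled.idxmin()
worst_z = rolled.min()

print(worst_date.date(), round(worst_z, 2))
```

On the real data, calling rolled.idxmin() the same way should point you at the day of the drop.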

Step 4: Rolling minimum return

Let’s take a look at the largest single-day percentage drop within a rolling 30-day window.

min_pct_change = (
    data
    .Close
    .pct_change()
    .rolling(window=30)
    .min()
)

Here, we calculate the daily percentage change on the closing price. Then we apply the min function to the rolling window of data.
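Here is the same chain on a handful of made-up closing prices with a 3-row window, so each step is easy to check by hand (the prices are hypothetical).

```python
import pandas as pd

# Hypothetical closing prices, not real market data
close = pd.Series([100.0, 102.0, 96.9, 98.0, 99.0, 101.0])

# Daily percentage change: the first row has no prior day, so it is NaN
daily_returns = close.pct_change()

# Worst single-day return inside each 3-row window.
# A window that still contains the leading NaN is incomplete and stays NaN.
worst_in_window = daily_returns.rolling(window=3).min()

print(daily_returns.round(4).tolist())
print(worst_in_window.round(4).tolist())
```

The 5% drop from 102.0 to 96.9 stays the rolling minimum until it slides out of the window. That is why the plot of a rolling minimum tends to look like a series of flat steps.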

Here’s the plot.

min_pct_change.plot()

And the histogram.

min_pct_change.hist(bins=20)

You can see Netflix has had a couple of pretty bad days!
