Evaluate A Real Trading Strategy With Python and Pandas
In today’s issue, I’m going to show you how to evaluate a trading strategy with Python and pandas.
Fund managers report their holdings every month. They don’t want to tell investors they lost money on meme stocks. So at the end of the month, they sell low-quality assets and buy high-quality assets, like bonds.
We might be able to take advantage of this by buying bonds toward the end of the month and selling them at the beginning of the next month.
Why would an edge like this persist? It is probably too messy, too small, or just not interesting to professionals. Which makes it perfect for us.
Most people over-complicate algorithmic trading - it really can be this simple.
Here’s how to investigate this effect with Python, step by step:
Step 1: Get Data
We’re going to use TLT as a proxy for bonds. We’ll use the yfinance library to get about 20 years of data in 1 line of code.
But first, the imports:
%matplotlib inline
import pandas as pd
import numpy as np

import yfinance as yf
I’m using Jupyter Notebook and want my charts plotted inline, which is what %matplotlib inline does.
Let's get the data.
tlt = yf.download("TLT", start="2002-01-01", end="2022-06-30")
This downloads 5,015 days of price history into a DataFrame.
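Depending on the installed yfinance version, the downloaded frame may not look exactly like the one used here: the columns can arrive as a two-level MultiIndex, and "Adj Close" may be missing when prices are already adjusted. The helper below is a hypothetical sketch (normalize_prices is my own name, not part of yfinance) that flattens the columns and falls back to "Close". It runs on a made-up frame, so no network connection is needed.

```python
import pandas as pd


def normalize_prices(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with flat column names and an 'Adj Close' column."""
    out = df.copy()
    # Assumption: a two-level header keeps the field name (e.g. "Close") in level 0.
    if isinstance(out.columns, pd.MultiIndex):
        out.columns = out.columns.get_level_values(0)
    # Assumption: if "Adj Close" is absent, "Close" already holds adjusted prices.
    if "Adj Close" not in out.columns:
        out["Adj Close"] = out["Close"]
    return out


# Made-up frame imitating a two-level header; not real TLT data.
idx = pd.bdate_range("2022-01-03", periods=3)
cols = pd.MultiIndex.from_tuples([("Close", "TLT"), ("Volume", "TLT")])
raw = pd.DataFrame([[100.0, 10], [101.0, 12], [99.5, 11]], index=idx, columns=cols)

flat = normalize_prices(raw)
print(list(flat.columns))
```

If your download already has flat columns and an "Adj Close" column, the helper returns an unchanged copy and the rest of the steps work as written.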
Step 2: Prepare Data
Let's add a few columns to the DataFrame that we’ll use later.
First, we compute the log returns.
tlt["log_return"] = np.log(tlt["Adj Close"] / tlt["Adj Close"].shift(1))
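Log returns are convenient here because they add up over time, which is why the later steps can simply sum them. A minimal sketch with made-up prices (not real TLT data) shows that the sum of daily log returns equals the log of the total price ratio:

```python
import numpy as np
import pandas as pd

# Made-up prices for illustration; not real TLT data.
prices = pd.Series([100.0, 102.0, 99.0, 103.0])

# Same formula as above; the first value is NaN because there is no prior price.
log_return = np.log(prices / prices.shift(1))

# Summing log returns recovers the log of the total price ratio.
total_log_return = log_return.sum()  # NaN is skipped by default
direct = np.log(prices.iloc[-1] / prices.iloc[0])

print(np.isclose(total_log_return, direct))
```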
Then we’ll add a column for the calendar day of the month (1 - 31) and a column for the year.
tlt["day_of_month"] = tlt.index.day

tlt["year"] = tlt.index.year
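Keep in mind that day_of_month is a calendar day, not a trading-day count. The sketch below uses a business-day range as a stand-in for a price index (an assumption that ignores market holidays) to show which calendar days actually appear around one month end:

```python
import pandas as pd

# Business days only, as a stand-in for a price index (holidays are ignored here).
idx = pd.bdate_range("2022-04-25", "2022-05-06")
cal = pd.DataFrame(index=idx)
cal["day_of_month"] = cal.index.day
cal["year"] = cal.index.year

# Weekend dates never appear, so some calendar days are missing in any given month.
print(cal["day_of_month"].tolist())
```

In this window the 30th and the 1st fall on a weekend, so the last trading day of April is the 29th and the first trading day of May is the 2nd.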
Step 3: Investigate Our Hypothesis
We expect positive returns in TLT toward the end of the month, because we think fund managers buy it then. We expect negative returns toward the beginning of the month, when fund managers sell their high-quality assets and go back to buying meme stocks.
To see if this is true, we want the mean return on every day of the month.
grouped_by_day = tlt.groupby("day_of_month").log_return.mean()
Then it’s simple to plot:
grouped_by_day.plot.bar(title="Mean Log Returns by Calendar Day of Month")
We see evidence that returns are positive during the last days of the month and negative during the first.
This plot covers the entire range of data. To explore how persistent the effect is, group and average the returns over different time frames.
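One way to do that is to add a period label and group by it together with the day of the month. This is a sketch on synthetic random returns (the demo frame and the 2012 split point are my own assumptions, chosen only for illustration), so the numbers it produces carry no meaning:

```python
import numpy as np
import pandas as pd

# Synthetic random returns for illustration only; not real TLT data.
rng = np.random.default_rng(0)
idx = pd.bdate_range("2002-07-30", "2022-06-30")
demo = pd.DataFrame({"log_return": rng.normal(0.0, 0.01, len(idx))}, index=idx)
demo["day_of_month"] = demo.index.day
demo["year"] = demo.index.year

# Assumption: two arbitrary sub-periods split at 2012; pick any split you like.
demo["period"] = np.where(demo["year"] < 2012, "2002-2011", "2012-2022")

# Rows are calendar days, columns are sub-periods.
by_period = demo.groupby(["day_of_month", "period"])["log_return"].mean().unstack()
print(by_period.shape)
```

With the real tlt frame in place of demo, by_period.plot.bar(subplots=True) draws one panel per sub-period, which makes it easy to see whether the month-end pattern shows up in both.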
Step 4: Build A Simple Trading Strategy
Let’s build a naive strategy to test our hypothesis:
- Buy and hold TLT during the last week of the month
- Short and hold TLT during the first week of the month
Simple.
This code creates 3 new columns:
- first_week_returns - the daily log return if it’s between the 1st and 7th day of the month, otherwise 0
- last_week_returns - the daily log return if it's on or after the 23rd day of the month, otherwise 0
- last_week_less_first_week - the difference between last_week_returns and first_week_returns
last_week_less_first_week represents the returns from our naive strategy. It's basically saying "go long TLT the last week" and "go short TLT the first week".
tlt["first_week_returns"] = 0.0
tlt.loc[tlt.day_of_month <= 7, "first_week_returns"] = tlt[
    tlt.day_of_month <= 7
].log_return

tlt["last_week_returns"] = 0.0
tlt.loc[tlt.day_of_month >= 23, "last_week_returns"] = tlt[
    tlt.day_of_month >= 23
].log_return

tlt["last_week_less_first_week"] = tlt.last_week_returns - tlt.first_week_returns
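The same columns can be built more compactly with Series.where, and it can help to make the implied position explicit. The following is an alternative sketch on synthetic random returns (the demo frame and the position column are my own additions, not part of the original code):

```python
import numpy as np
import pandas as pd

# Synthetic random returns for illustration only; not real TLT data.
rng = np.random.default_rng(1)
idx = pd.bdate_range("2022-01-03", "2022-03-31")
demo = pd.DataFrame({"log_return": rng.normal(0.0, 0.01, len(idx))}, index=idx)
demo["day_of_month"] = demo.index.day

# Keep the return where the condition holds, otherwise use 0.0.
first_week = demo["log_return"].where(demo["day_of_month"] <= 7, 0.0)
last_week = demo["log_return"].where(demo["day_of_month"] >= 23, 0.0)
demo["last_week_less_first_week"] = last_week - first_week

# Position implied by the strategy: +1 long, -1 short, 0 flat.
is_long = (demo["day_of_month"] >= 23).astype(int)
is_short = (demo["day_of_month"] <= 7).astype(int)
demo["position"] = is_long - is_short

# The strategy return is the position times the daily log return.
print(np.allclose(demo["last_week_less_first_week"], demo["position"] * demo["log_return"]))
```

Writing the strategy as position times return makes it easier to change the entry and exit days later without touching the rest of the analysis.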
Step 5: Plot Returns
Let’s create a naive backtest of our naive strategy to get a feel for the returns.
The point of this is not to have a highly accurate, statistically significant backtest. It's to spend the shortest amount of time possible to see if this strategy is worth pursuing in more detail.
First, we’ll average the daily strategy returns by year and plot them.
(
    tlt.groupby("year")
    .last_week_less_first_week.mean()
    .plot.bar(title="Mean Log Strategy Returns by Year")
)
We see more evidence that this effect is persistent through time. Since 2002, there have only been 3 years where returns are negative.
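Rather than counting bars by eye, the negative years can be listed directly. This is a sketch on synthetic random strategy returns (the demo frame is an assumption for illustration), so the count it prints says nothing about TLT:

```python
import numpy as np
import pandas as pd

# Synthetic random strategy returns for illustration only; not real TLT data.
rng = np.random.default_rng(2)
idx = pd.bdate_range("2002-07-30", "2022-06-30")
demo = pd.DataFrame(
    {"last_week_less_first_week": rng.normal(0.0, 0.005, len(idx))}, index=idx
)
demo["year"] = demo.index.year

# Same grouping as the bar chart, then keep only the years below zero.
yearly_mean = demo.groupby("year")["last_week_less_first_week"].mean()
negative_years = yearly_mean[yearly_mean < 0]

print(f"{len(negative_years)} of {len(yearly_mean)} years are negative")
print(negative_years.index.tolist())
```

Swapping demo for the real tlt frame gives the list of years to look at more closely.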
Let’s take a look at the cumulative returns by year.
(
    tlt.groupby("year")
    .last_week_less_first_week.sum()
    .cumsum()
    .plot(title="Cumulative Sum of Returns By Year")
)
And we can do the same by day.
tlt.last_week_less_first_week.cumsum().plot(title="Cumulative Sum of Returns By Day")
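These plots are in log-return units. To read them as percentage gains, the cumulative sum can be converted with exp(x) - 1. A minimal sketch with made-up numbers (not real strategy returns):

```python
import numpy as np
import pandas as pd

# Made-up daily strategy log returns; not real TLT data.
strategy_log_returns = pd.Series([0.01, -0.005, 0.0, 0.02])

cumulative_log = strategy_log_returns.cumsum()
# exp(sum of log returns) - 1 is the compounded simple return.
cumulative_simple = np.exp(cumulative_log) - 1

print(cumulative_simple.round(4).tolist())
```

Note that negating a log return only approximates the return of a short position, so on the short days this conversion is a rough guide rather than an exact figure.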
There's evidence that this effect is profitable. It's worth spending more time exploring it in depth.