# PQN #003: Evaluate A Real Trading Strategy With Python and Pandas

In today’s issue, I’m going to show you how to evaluate a trading strategy with Python and pandas.

Fund managers report their holdings every month. They don’t want to tell investors they lost money on meme stocks. So at the end of the month, they sell low-quality assets and buy high-quality assets, like bonds.

We might be able to take advantage of this by buying bonds at the end of the month and selling them at the beginning of the next month.

Why does this work? The edge is probably too messy, too small, or just not interesting to professionals. Which makes it perfect for us.

Most people over-complicate algorithmic trading – it really can be this simple.

Here’s how to investigate this effect with Python, step by step:

## Step 1: Get Data

We’re going to use TLT as a proxy for bonds. We’ll use the yfinance library to get its daily price history, going back to 2002, in 1 line of code.

But first, the imports:

%matplotlib inline
import pandas as pd
import numpy as np

import yfinance as yf

I’m using Jupyter Notebook and want my charts plotted inline, which is what %matplotlib inline does.

Let’s get the data.
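The download really is a one-liner. Here is a minimal sketch, assuming yfinance's `Ticker.history` interface and a 2002 start date (taken from the date range discussed later in this issue). The helper names `load_tlt` and `describe_range` are made up for illustration, and with `auto_adjust=True` the adjusted price should land in a column called `Close`.

```python
import pandas as pd


def load_tlt(start="2002-01-01"):
    """Download daily TLT bars. Needs network access and the yfinance package."""
    import yfinance as yf  # imported here so describe_range works without it

    # auto_adjust=True folds dividends and splits into the "Close" column
    return yf.Ticker("TLT").history(start=start, auto_adjust=True)


def describe_range(prices):
    """Return (first date, last date, row count) as a quick sanity check."""
    return prices.index[0], prices.index[-1], len(prices)


# In the notebook:
# tlt = load_tlt()
# describe_range(tlt)
```

It's worth running the sanity check once after the download so you know exactly which dates you're working with before drawing conclusions.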

## Step 2: Prepare Data

Let’s add a few columns to the DataFrame that we’ll use later.

First, we compute the log returns.
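A log return is the natural log of today's price divided by yesterday's price. Here is a small sketch on made-up prices. The `Close` column name is an assumption (depending on your yfinance version and settings, the adjusted price may be called `Adj Close` instead).

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded frame; the "Close" column name is an assumption
tlt = pd.DataFrame(
    {"Close": [100.0, 101.0, 99.0, 102.0]},
    index=pd.to_datetime(["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06"]),
)

# log return = ln(P_t / P_{t-1}); the first row is NaN because there is no prior price
tlt["log_return"] = np.log(tlt["Close"] / tlt["Close"].shift(1))

# Log returns add across time: their sum equals ln(last price / first price)
total = tlt["log_return"].sum()
```

That additive property is why we use log returns here: later steps can simply sum and average them.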

Then we’ll add a column for the calendar day of the month (1 – 31) and a column for the year.

tlt["day_of_month"] = tlt.index.day

tlt["year"] = tlt.index.year

## Step 3: Investigate Our Hypothesis

We expect positive returns in TLT toward the end of the month, when we think fund managers buy it. We expect negative returns toward the beginning of the month, when fund managers sell their high-quality assets and go back to buying meme stocks.

To see if this is true, we want the mean return on every day of the month.

grouped_by_day = tlt.groupby("day_of_month").log_return.mean()

Then it’s simple to plot:

grouped_by_day.plot.bar(title="Mean Log Returns by Calendar Day of Month")

We see evidence that returns are positive during the last days of the month and negative during the first.

This covers the entire range of data. To check whether the effect persists, group and average the returns over different time frames.
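One way to look at persistence is to split the history into blocks of years and compute the same day-of-month averages within each block. This is a rough sketch, not the only way to slice it. The function name `mean_return_by_day_and_period` and the 5-year block size are my own choices, and the demo frame is random noise standing in for the real data.

```python
import numpy as np
import pandas as pd


def mean_return_by_day_and_period(tlt, period_years=5):
    """Mean log return per calendar day, one column per block of years."""
    # Label each row with the first year of its block, e.g. 2002, 2007, 2012, ...
    first_year = tlt["year"].min()
    period_start = first_year + (tlt["year"] - first_year) // period_years * period_years
    labeled = tlt.assign(period_start=period_start)
    return labeled.pivot_table(
        index="day_of_month",
        columns="period_start",
        values="log_return",
        aggfunc="mean",
    )


# Random returns on business days, only to show the shape of the output
rng = np.random.default_rng(0)
idx = pd.bdate_range("2002-07-30", "2022-06-30")
demo = pd.DataFrame({"log_return": rng.normal(0, 0.01, len(idx))}, index=idx)
demo["day_of_month"] = demo.index.day
demo["year"] = demo.index.year

table = mean_return_by_day_and_period(demo)
# table.plot.bar(subplots=True, figsize=(10, 12)) draws one bar chart per block
```

If the end-of-month bars stay positive in most blocks, the effect is less likely to come from one unusual stretch of history.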

## Step 4: Build A Simple Trading Strategy

Let’s build a naive strategy to test our hypothesis:

• Buy and hold TLT during the last week of the month
• Short and hold TLT during the first week of the month

Simple.

The code below creates 3 new columns:

• first_week_returns – the daily log return if it’s between the 1st and 7th day of the month, otherwise 0
• last_week_returns – the daily log return if it’s on or after the 23rd day of the month, otherwise 0
• last_week_less_first_week – the difference between last_week_returns and first_week_returns

last_week_less_first_week represents the returns from our naive strategy. It’s basically saying “go long TLT the last week” and “go short TLT the first week”.

tlt["first_week_returns"] = 0.0
tlt.loc[tlt.day_of_month <= 7, "first_week_returns"] = tlt[
    tlt.day_of_month <= 7
].log_return

tlt["last_week_returns"] = 0.0
tlt.loc[tlt.day_of_month >= 23, "last_week_returns"] = tlt[
    tlt.day_of_month >= 23
].log_return

tlt["last_week_less_first_week"] = tlt.last_week_returns - tlt.first_week_returns
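To check the logic by eye, here is the same idea on six made-up trading days. It uses `Series.where` as a more compact way to write the masking. The dates and returns are invented for illustration.

```python
import pandas as pd

# Six synthetic trading days; the returns are invented for illustration
demo = pd.DataFrame(
    {"log_return": [0.01, -0.02, 0.005, 0.003, 0.004, -0.001]},
    index=pd.to_datetime(
        ["2022-03-01", "2022-03-07", "2022-03-08", "2022-03-22", "2022-03-23", "2022-03-31"]
    ),
)
demo["day_of_month"] = demo.index.day

# Keep the return where the condition holds, otherwise 0.0
demo["first_week_returns"] = demo["log_return"].where(demo["day_of_month"] <= 7, 0.0)
demo["last_week_returns"] = demo["log_return"].where(demo["day_of_month"] >= 23, 0.0)

# Long the last week, short the first week, flat in between
demo["last_week_less_first_week"] = demo["last_week_returns"] - demo["first_week_returns"]
```

On this frame the strategy column comes out as -0.01, 0.02, 0, 0, 0.004, -0.001. The first-week returns flip sign, the mid-month days contribute nothing, and the last-week returns pass through unchanged. Note that "on or after the 23rd" can cover up to nine calendar days, so "last week" is a loose label.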

## Step 5: Plot Returns

Let’s create a naive backtest of our naive strategy to get a feel for the returns.

The point of this is not to have a highly accurate, statistically significant backtest. It’s to spend the shortest amount of time possible to see if this strategy is worth pursuing in more detail.

First, we’ll average the daily strategy returns by year and plot them.

(
    tlt.groupby("year")
    .last_week_less_first_week.mean()
    .plot.bar(title="Mean Log Strategy Returns by Year")
)

We see more evidence that this effect is persistent through time. Since 2002, there have only been 3 years where returns are negative.
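Rather than counting bars on the chart, we can count the losing years directly. This is a small sketch. The helper `count_negative_years` is a made-up name, and the demo numbers are invented, so it says nothing about the real TLT results.

```python
import pandas as pd


def count_negative_years(tlt):
    """Return (number of losing years, list of those years) for the strategy."""
    yearly = tlt.groupby("year")["last_week_less_first_week"].mean()
    losing = yearly[yearly < 0]
    return len(losing), losing.index.tolist()


# Invented strategy returns for three years
demo = pd.DataFrame(
    {
        "year": [2020, 2020, 2021, 2021, 2022],
        "last_week_less_first_week": [0.01, 0.02, -0.03, 0.01, 0.0],
    }
)
n_losing, losing_years = count_negative_years(demo)
```

Calling `count_negative_years(tlt)` on the real frame gives the count behind the chart.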

Let’s take a look at the cumulative returns by year.

(
    tlt.groupby("year")
    .last_week_less_first_week.sum()
    .cumsum()
    .plot(title="Cumulative Sum of Returns By Year")
)

And we can do the same by day.

tlt.last_week_less_first_week.cumsum().plot(title="Cumulative Sum of Returns By Day")
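These charts are in log-return units. To read them as a percentage gain, exponentiate the cumulative sum and subtract 1. This is a sketch on invented numbers, and it is only an approximation for this strategy: the true simple return of a short position is -(e^r - 1) rather than e^(-r) - 1, though the two are close for small daily moves.

```python
import numpy as np
import pandas as pd

# Invented daily strategy log returns, standing in for last_week_less_first_week
strategy = pd.Series([0.01, -0.005, 0.0, 0.02])

cumulative_log = strategy.cumsum()

# exp(sum of log returns) - 1 gives the compounded simple return
cumulative_simple = np.exp(cumulative_log) - 1
```

On the real data, `(np.exp(tlt.last_week_less_first_week.cumsum()) - 1).plot()` would draw the same curve in simple-return terms.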

There’s evidence that this effect is profitable. It’s worth exploring in more depth.

Well, that’s it for today. I hope you enjoyed it.

See you again next week.