PQN #017: Build and run a backtest like the pros

In today’s issue, I’m going to show you how to build a backtest for a trading strategy.

A backtest tests a trading idea against historical market data. It’s a simulation of how the strategy might have performed in the market. Traders usually try to optimize performance metrics like the Sharpe ratio by tweaking input parameters like the lookback period.

Unfortunately, most beginners spend all their time tweaking backtests only to find they don’t work in real life. Even with out-of-sample data, cross-validation, and walk-forward analysis, backtest results are often way off. The majority of trading systems with a positive backtest are actually unprofitable.


Treat backtesting as an experiment

Suppose your strategy has $0 expected profit (i.e. it trades randomly) but you don’t know it. Before costs, a random strategy will produce a positive result about half the time and a negative result about half the time. The result will rarely be exactly $0.
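To make this concrete, here is a minimal sketch of a zero-edge strategy, assuming each trade wins or loses $1 with equal odds. The function name `random_strategy_pnl`, the trade count, and the seed are illustrative choices, not part of the backtest built below.

```python
import random

def random_strategy_pnl(n_trades, rng):
    """Total P&L of a zero-edge strategy: each trade wins or loses $1 with equal odds."""
    return sum(rng.choice((-1, 1)) for _ in range(n_trades))

rng = random.Random(42)  # fixed seed so the run is repeatable
results = [random_strategy_pnl(250, rng) for _ in range(2_000)]

# share of simulated strategies that ended above zero, and exactly at zero
share_positive = sum(r > 0 for r in results) / len(results)
share_zero = sum(r == 0 for r in results) / len(results)
print(f"positive: {share_positive:.1%}, exactly zero: {share_zero:.1%}")
```

Close to half of the runs finish in profit even though none of them has any edge.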

What do most people do with a negative backtest result? Tweak the parameters until it’s positive. To get around this problem, professionals will backtest their backtest.
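Here is a rough sketch of why tweaking is so dangerous. It assumes, as a simplification, that every parameter tweak produces an independent random result; real tweaks are correlated, so treat the numbers as an illustration only. The name `best_of_tweaks` is made up for this example.

```python
import random

def best_of_tweaks(n_tweaks, n_trades, rng):
    """Best P&L found after trying n_tweaks settings of a zero-edge strategy."""
    return max(
        sum(rng.choice((-1, 1)) for _ in range(n_trades))
        for _ in range(n_tweaks)
    )

rng = random.Random(7)  # fixed seed so the run is repeatable
# 500 traders each try 10 parameter settings and keep the best one
best_results = [best_of_tweaks(10, 101, rng) for _ in range(500)]

share_positive_after_tweaking = sum(r > 0 for r in best_results) / len(best_results)
print(f"positive after tweaking: {share_positive_after_tweaking:.1%}")
```

With ten tries, almost every trader ends up with a "profitable" backtest of a strategy that has no edge at all.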

I’m going to show you how.

By the end of this issue, you will know how to:

  • Set up a backtest with bt
  • Run a backtest and analyze the results
  • Perform a Monte Carlo simulation on your backtest

All in Python.

Let’s get started!

Step 1: Imports and set up

bt is a flexible backtesting framework for Python used to test quantitative trading strategies. Import Matplotlib too for plotting.

%matplotlib inline
import bt
import matplotlib.pyplot as plt

Fund managers report their holdings every month. They don’t want to tell investors they lost money on the latest meme stock. So they sell their meme stocks and buy higher quality assets, like bonds.

We might be able to take advantage of this effect by buying bonds toward the end of the month and selling them at the beginning.

Start by getting data for the bond ETF, TLT.

data = bt.get("tlt", start="2010-01-01", end="2022-06-30")

Then create the functions to run the backtest.

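def build_strategy(weights):
    # assumed algo stack: select everything, apply the target weights, rebalance
    algos = [bt.algos.SelectAll(), bt.algos.WeighTarget(weights), bt.algos.Rebalance()]
    return bt.Strategy("wd", algos)  # "wd" is the name used later in first_res['wd']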

The first function takes daily portfolio weights. bt has a library of built-in algos that take care of the logic for you. This function weights the portfolio based on the input and rebalances daily.

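def build_backtest(strategy, df, initial_capital, commission_model):
    # commissions takes a function of (quantity, price) that returns the fee
    return bt.Backtest(strategy, df, initial_capital=initial_capital, commissions=commission_model)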


The next function takes the strategy you just built, market data, initial capital, and a commission model.

def commission_model(q, p):
    # p is price, q is quantity
    val = abs(q * p)
    if val > 2000:
        return 8.6
    if val > 1000:
        return 4.3
    if val > 100:
        return 1.5
    return 1.0

Your commission model can be anything you want. It just needs to take quantity and price, in that order, and return the fee for the trade.
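A quick way to sanity-check the tiers is to call the function on a few trades. The trades below are hypothetical; the function is repeated so the snippet runs on its own.

```python
def commission_model(q, p):
    # same tiers as above: the fee depends on the absolute trade value
    val = abs(q * p)
    if val > 2000:
        return 8.6
    if val > 1000:
        return 4.3
    if val > 100:
        return 1.5
    return 1.0

# hypothetical trades as (quantity, price); a negative quantity is a sell
trades = [(1, 50.0), (10, 50.0), (-15, 100.0), (100, 25.0)]
fees = [commission_model(q, p) for q, p in trades]
print(fees)  # [1.0, 1.5, 4.3, 8.6]
```

Because the function uses the absolute trade value, buys and sells of the same size pay the same fee.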

def add_dom(df):
    # add the day of month and return
    added = df.copy()
    added["day_of_month"] = df.index.day
    return added

Next, create a function that adds a column to the DataFrame with the day of the month. I want to be long TLT for the last week of the month and short during the first week. To do this, I need to know the day of the month.

def add_weights(df, symbol):
    # copy the price column so the weights share its index and column name
    strategy = df[[symbol]].copy()
    # start with no position within the month
    strategy.loc[:] = 0
    # short within the first week of the month
    strategy.loc[df.day_of_month <= 7] = -1

    # long during the last week of the month
    strategy.loc[df.day_of_month >= 23] = 1
    return strategy

The last function weights the portfolio 100% short during the first week of the month and 100% long during the last week of the month. All other days the strategy is out of the market.
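The same rule can be sketched without pandas to check the cutoffs. The helper `target_weight` and the choice of March 2022 are just for illustration; note that it walks every calendar day, while the real data only contains trading days.

```python
from datetime import date, timedelta

def target_weight(day_of_month):
    """Mirror the rule above: short on days 1-7, long from day 23, otherwise flat."""
    if day_of_month <= 7:
        return -1
    if day_of_month >= 23:
        return 1
    return 0

# every calendar day in March 2022
days = [date(2022, 3, 1) + timedelta(days=i) for i in range(31)]
weights_by_day = {d.isoformat(): target_weight(d.day) for d in days}

print(weights_by_day["2022-03-04"], weights_by_day["2022-03-15"], weights_by_day["2022-03-28"])
# -1 0 1
```

One thing this makes visible: a cutoff at day 23 covers nine calendar days in a 31-day month, so the "last week" is a little longer than seven days in most months.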

initial_capital = 10_000

Finally, set the initial capital.

Step 2: Run the initial backtest

Now that everything is in place, run the backtest.

# add the day of month
data_with_dom = add_dom(data)

# get the portfolio weights
weights = add_weights(data_with_dom, 'tlt')

# build the bt strategy
strategy = build_strategy(weights)

# build the backtest
backtest = build_backtest(strategy, data, initial_capital, commission_model)

# run the backtest
first_res = bt.run(backtest)

This code sets up the strategy and runs the backtest. bt makes it easy to see the results.

first_res.display()

This prints performance statistics about the strategy. Make note of the daily Sharpe which we’ll use next.
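For reference, a daily Sharpe ratio is commonly computed as the mean daily return divided by its standard deviation, annualized with the square root of the number of trading days. The sketch below assumes a zero risk-free rate and 252 trading days per year; the sample returns are made up, and bt’s own calculation may differ in the details.

```python
import math
import statistics

def daily_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio from daily returns, assuming a zero risk-free rate."""
    mean = statistics.mean(daily_returns)
    stdev = statistics.stdev(daily_returns)  # sample standard deviation
    return mean / stdev * math.sqrt(periods_per_year)

# made-up daily returns for illustration
sample_returns = [0.002, -0.001, 0.003, -0.002, 0.001]
print(round(daily_sharpe(sample_returns), 2))
```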

bt makes it easy to plot the results, too.

first_res.plot(figsize=(20, 10))

And the weights.

first_res.plot_weights('wd', figsize=(20, 5))

Step 3: Backtest the backtest

We need one more function.

def shuffle_prices(df):
    # randomly shuffle the prices without replacement
    shuffled = df.sample(frac=1)
    # reset the index
    shuffled.index = df.index
    return shuffled

This shuffles the prices and resets the date index.
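To see what the shuffle does mechanically, here is a small standard-library sketch of the same idea: permute the prices without replacement, then pair them with the original dates. The dates and prices are made-up values.

```python
import random

dates = ["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06", "2022-01-07"]
prices = [143.3, 142.7, 141.9, 142.3, 141.3]  # made-up values

rng = random.Random(0)  # fixed seed so the run is repeatable
# a random permutation: every price appears exactly once, in a new order
shuffled_prices = rng.sample(prices, k=len(prices))
# keep the original dates, so each date now maps to a random price
shuffled_series = dict(zip(dates, shuffled_prices))

print(shuffled_series)
```

The set of prices is unchanged and the dates are unchanged. Only the pairing between them is randomized.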

Why do we do this?

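Shuffling breaks any link between the calendar and the prices, so a strategy run on shuffled data can only look good by chance. I’m going to run a simulation of 1,000 backtests on shuffled data, plot the resulting Sharpe ratios, and see where the original backtest result falls in that distribution.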

runs = 1000
initial_sharpe = first_res['wd'].daily_sharpe
sharpes = []

Set the number of runs and grab the daily Sharpe ratio from the first backtest. Finally, create a list to capture the Sharpe ratios.

for run in range(0, runs):
    # shuffle the prices
    shuffled = shuffle_prices(data)
    # add the day of month
    shuffled_with_dom = add_dom(shuffled)
    # add the weights
    weights = add_weights(shuffled_with_dom, 'tlt')
    # build the strategy
    strategy = build_strategy(weights)
    # build the backtest (commission_model is the function defined in Step 1)
    backtest = build_backtest(strategy, shuffled_with_dom, initial_capital, commission_model)
    # run the backtest
    res = bt.run(backtest)
    # accumulate sharpe ratios
    sharpes.append(res['wd'].daily_sharpe)

This loop runs the backtest against randomly shuffled prices. It then accumulates the Sharpe ratios which are based on the random data.

Finally, find out where the Sharpe ratio is in the distribution of random backtest results.

dist = plt.hist(sharpes, bins=10)
plt.axvline(initial_sharpe, linestyle='dashed', linewidth=1)

Run a simple permutation-style test to check significance. The p-value is N / runs, where N is the number of random results that are better than our strategy.

N = sum(i > initial_sharpe for i in sharpes)
p_value = N / runs
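The same calculation can be wrapped in a small function and checked on toy numbers. The Sharpe ratios below are made up, and the function names are illustrative. The second function is a common, slightly more conservative variant that counts the observed result as one of the runs, so the p-value can never be exactly zero.

```python
def empirical_p_value(observed, random_results):
    """Share of random results that beat the observed one (N / runs)."""
    n_better = sum(r > observed for r in random_results)
    return n_better / len(random_results)

def conservative_p_value(observed, random_results):
    """Variant that counts the observed result itself: (N + 1) / (runs + 1)."""
    n_better = sum(r > observed for r in random_results)
    return (n_better + 1) / (len(random_results) + 1)

# made-up Sharpe ratios from ten randomized backtests
random_sharpes = [-0.4, -0.1, 0.0, 0.2, 0.3, 0.5, 0.7, -0.6, 0.1, -0.2]

p = empirical_p_value(0.45, random_sharpes)
p_conservative = conservative_p_value(0.45, random_sharpes)
print(p, round(p_conservative, 3))
```

Two of the ten random results beat 0.45, so the plain p-value is 0.2. With only ten runs the estimate is coarse, which is why the simulation above uses 1,000.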

Very few randomized tests produce a better result than our backtest. The p-value is below 1%, so the result is statistically significant: it is unlikely to be a product of chance. This gives us some confidence that the strategy can achieve a similar result in real trading.

Well, that’s it for today. I hope you enjoyed it.

See you again next week.