How to backtest 2,000,000 simulations for the best exits

September 9, 2023

Backtests are not a way to brute force optimize parameters to maximize a performance metric.

Doing that leads to overfitting and losses.

But optimization does play an important part in building trading strategies.

Today, we’ll see how.

A backtest is a simulation of market dynamics used to test how a trading strategy might have performed.

The problem is that the markets are noisy and difficult to model.

So when running backtests, optimized models often fit to noise, instead of market inefficiencies. When applying the optimized parameters to previously unseen data, the model falls apart (along with your portfolio).
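To make the overfitting problem concrete, here is a minimal sketch. It assumes 1,000 hypothetical parameter sets whose daily returns are pure noise (the counts, volatility, and variable names are illustrative and not from the strategy in this article). Picking the best performer in-sample always finds something that looks good, even though no parameter set has any real edge.

```python
import numpy as np

rng = np.random.default_rng(42)

n_params = 1000  # hypothetical number of parameter combinations tried
n_days = 250     # trading days in each sample

# Every "strategy" is pure noise: zero true edge, 1% daily volatility.
in_sample = rng.normal(0.0, 0.01, size=(n_params, n_days))
out_of_sample = rng.normal(0.0, 0.01, size=(n_params, n_days))

in_sample_mean = in_sample.mean(axis=1)
out_of_sample_mean = out_of_sample.mean(axis=1)

# The "optimized" parameter set is simply the luckiest one in-sample.
best = int(in_sample_mean.argmax())

print(f"best in-sample mean daily return: {in_sample_mean[best]:.4%}")
print(f"same parameters out of sample:    {out_of_sample_mean[best]:.4%}")
```

The in-sample winner is selected for its luck, so its out-of-sample result is just another draw from the same zero-mean noise.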

All parameters of a strategy affect the result, but only a few determine entries and exits depending on the market price. These are the parameters that should be optimized.

For example, is a 1% trailing stop better than a $1 stop loss?

We’ll use the cutting-edge backtesting framework vectorbt to optimize the exit type and stop value for a momentum strategy.

Strap on your seatbelt.

Let’s go!

Imports and set up

Given the power of what we’re about to do, there are very few imports required.

import pytz
from datetime import datetime, timedelta
import numpy as np
import pandas as pd
import vectorbt as vbt

We’ll set some variables for the analysis.

symbols = [
    "META",
    "AMZN",
    "AAPL",
    "NFLX",
    "GOOG",
]

start_date = datetime(2018, 1, 1, tzinfo=pytz.utc)
end_date = datetime(2021, 1, 1, tzinfo=pytz.utc)

traded_count = 3
window_len = timedelta(days=12 * 21)

seed = 42
window_count = 400
exit_types = ["SL", "TS", "TP"]
stops = np.arange(0.01, 1 + 0.01, 0.01)

vectorbt has a built-in data download capability using yfinance, but it takes a bit of manipulation to get it right.

yfdata = vbt.YFData.download(symbols, start=start_date, end=end_date)
ohlcv = yfdata.concat()

split_ohlcv = {}

for k, v in ohlcv.items():
    split_df, split_indexes = v.vbt.range_split(
        range_len=window_len.days, n=window_count
    )
    split_ohlcv[k] = split_df
ohlcv = split_ohlcv

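The code downloads historical price data for each symbol and concatenates it into one DataFrame per price field (Open, High, Low, Close, Volume), with one column per symbol. We use the vectorbt range_split method to evenly split the market data into 400 windows of 252 rows each.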

We initialize an empty dictionary called split_ohlcv, then iterate through each price field's DataFrame, split it into the smaller time windows, and store the result in split_ohlcv under the same key.
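To get a feel for what "evenly split" means here, the sketch below places equal-length windows at evenly spaced start positions. This is an illustration of the idea, not vectorbt's implementation: the helper name even_window_bounds is hypothetical, and the 756 rows are an assumed approximation of three years of daily bars.

```python
import numpy as np


def even_window_bounds(n_rows, window_len, n_windows):
    """Return (start, end) row positions for evenly spaced, equal-length windows."""
    if window_len > n_rows:
        raise ValueError("window_len cannot exceed n_rows")
    # Spread the start positions evenly between the first and last valid start.
    starts = np.linspace(0, n_rows - window_len, n_windows).round().astype(int)
    return [(int(s), int(s) + window_len) for s in starts]


# Assumed: ~756 daily rows, 252-row windows, 400 windows (as in the settings above).
bounds = even_window_bounds(n_rows=756, window_len=252, n_windows=400)

print(bounds[0], bounds[-1])
print(len(bounds))
```

With far more windows than non-overlapping slots, neighboring windows overlap heavily, so the 400 splits are not independent samples.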

Build the momentum strategy

Our strategy selects the top 3 stocks every split based on their mean return. The strategy equally allocates across the stocks at the beginning of the period, and exits at the end.

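# Mean daily return of each symbol within each split
momentum = ohlcv["Close"].pct_change().mean()

# Sort within each split, then keep the first `traded_count` symbols per split.
# Note: sort_values sorts in ascending order by default, so head() keeps the
# lowest mean returns; pass ascending=False to keep the highest instead.
sorted_momentum = (
    momentum
    .groupby(
        "split_idx", 
        group_keys=False, 
        sort=False
    )
    .apply(
        pd.Series.sort_values
    )
    .groupby("split_idx")
    .head(traded_count)
)

selected_open = ohlcv["Open"][sorted_momentum.index]
selected_high = ohlcv["High"][sorted_momentum.index]
selected_low = ohlcv["Low"][sorted_momentum.index]
selected_close = ohlcv["Close"][sorted_momentum.index]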

The code measures each stock's momentum as the mean of the daily percentage changes in its closing price within each split. It then sorts these values within each split and selects the top 3 stocks with the highest momentum.

Finally, it extracts the prices of the selected stocks using their indices and stores them in selected_open, selected_high, selected_low, and selected_close, respectively.
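The sort direction matters for this kind of selection, so here is a small pandas-only sketch on made-up numbers (the symbols AAA, BBB, CCC and their values are purely illustrative). It shows that sorting in descending order before taking the head of each group keeps the highest values, while the default ascending sort keeps the lowest.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [[0, 1], ["AAA", "BBB", "CCC"]], names=["split_idx", "symbol"]
)
# Made-up mean daily returns per (split, symbol)
momentum = pd.Series([0.001, 0.003, -0.002, 0.004, -0.001, 0.002], index=index)

# Descending sort, then the first 2 rows per split: the highest momentum.
top = momentum.sort_values(ascending=False).groupby(level="split_idx").head(2)

# Default ascending sort, then the first 2 rows per split: the lowest momentum.
bottom = momentum.sort_values().groupby(level="split_idx").head(2)

top_selected = sorted(top.index.tolist())
bottom_selected = sorted(bottom.index.tolist())

print(top_selected)
print(bottom_selected)
```

It is worth printing the selected index once on real data to confirm the selection matches the intent.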

Test the order types

There’s a lot of code here. But we’re essentially creating the exit signals for each of the different order types.

# Enter every selected stock on the first bar of its split
entries = pd.DataFrame.vbt.signals.empty_like(selected_open)
entries.iloc[0, :] = True

# Stop-loss exits for every stop value
sl_exits = vbt.OHLCSTX.run(
    entries,
    selected_open,
    selected_high,
    selected_low,
    selected_close,
    sl_stop=list(stops),
    stop_type=None,
    stop_price=None,
).exits

# Trailing-stop exits for every stop value
ts_exits = vbt.OHLCSTX.run(
    entries,
    selected_open,
    selected_high,
    selected_low,
    selected_close,
    sl_stop=list(stops),
    sl_trail=True,
    stop_type=None,
    stop_price=None,
).exits

# Take-profit exits for every stop value
tp_exits = vbt.OHLCSTX.run(
    entries,
    selected_open,
    selected_high,
    selected_low,
    selected_close,
    tp_stop=list(stops),
    stop_type=None,
    stop_price=None,
).exits

# Give all three results the same column level names
sl_exits.vbt.rename_levels({"ohlcstx_sl_stop": "stop_value"}, inplace=True)
ts_exits.vbt.rename_levels({"ohlcstx_sl_stop": "stop_value"}, inplace=True)
tp_exits.vbt.rename_levels({"ohlcstx_tp_stop": "stop_value"}, inplace=True)
ts_exits.vbt.drop_levels("ohlcstx_sl_trail", inplace=True)

# Force an exit on the last bar of each split
sl_exits.iloc[-1, :] = True
ts_exits.iloc[-1, :] = True
tp_exits.iloc[-1, :] = True

# Keep only the first exit after each entry
sl_exits = sl_exits.vbt.signals.first(reset_by=entries, allow_gaps=True)
ts_exits = ts_exits.vbt.signals.first(reset_by=entries, allow_gaps=True)
tp_exits = tp_exits.vbt.signals.first(reset_by=entries, allow_gaps=True)

# Stack the three exit types under a new "exit_type" column level
exits = pd.DataFrame.vbt.concat(
    sl_exits,
    ts_exits,
    tp_exits,
    keys=pd.Index(exit_types, name="exit_type"),
)

The code creates a DataFrame of False values with the same shape as selected_open and sets the first row to True, which marks our entry point.

It then calculates three types of exit signals: stop-loss (sl_exits), trailing stop (ts_exits), and take-profit (tp_exits).
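To see how the three exit types differ, here is a simplified NumPy sketch on a made-up price path. It is an assumption-laden illustration, not what OHLCSTX does internally: it checks closing prices only (the vectorbt call above is also given the open, high, and low), and the helper name first_exit_index is hypothetical.

```python
import numpy as np


def first_exit_index(close, stop, kind):
    """Index of the first bar whose close triggers the exit, or the last bar if none does.

    kind: "SL" (stop-loss), "TS" (trailing stop) or "TP" (take-profit).
    stop: fraction of the reference price, e.g. 0.05 for 5%.
    """
    close = np.asarray(close, dtype=float)
    entry = close[0]
    if kind == "SL":
        # Fixed level below the entry price
        hit = close <= entry * (1 - stop)
    elif kind == "TS":
        # Level follows the highest close seen so far
        running_high = np.maximum.accumulate(close)
        hit = close <= running_high * (1 - stop)
    elif kind == "TP":
        # Fixed level above the entry price
        hit = close >= entry * (1 + stop)
    else:
        raise ValueError(f"unknown exit type: {kind}")
    hit[0] = False  # never exit on the entry bar
    idx = np.flatnonzero(hit)
    return int(idx[0]) if idx.size else len(close) - 1


close = [100, 104, 110, 103, 96, 94, 108]  # made-up closing prices
results = {kind: first_exit_index(close, stop=0.05, kind=kind) for kind in ("SL", "TS", "TP")}
print(results)  # {'SL': 5, 'TS': 3, 'TP': 2}
```

On this path the take-profit exits first at 110, the trailing stop exits on the pullback to 103, and the stop-loss only triggers once the price falls to 94.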

The levels of these DataFrames are renamed for clarity, and the last row of each DataFrame is set to True to ensure an exit at the end of the split.

Finally, we concatenate the exit signals into a single DataFrame called exits, with a new index level to differentiate between the three exit strategies.
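The "force an exit on the last bar, then keep only the first exit" step can be sketched in plain pandas. This covers only the single-entry case used here, under the assumption that each column has one entry on the first bar; vectorbt's signals.first with reset_by is more general. The column names A and B are made up.

```python
import pandas as pd

# Made-up raw exit signals: column A is stopped out on bar 2, column B never is.
exits = pd.DataFrame(
    {
        "A": [False, False, True, False, True],
        "B": [False, False, False, False, False],
    }
)

# Guarantee an exit on the final bar of the window.
exits.iloc[-1, :] = True

# Keep only the first True in each column: the running count of exits equals 1
# exactly from the first exit until the second one.
first_exits = exits & (exits.cumsum() == 1)

print(first_exits)
```

Column A keeps its stop on bar 2 and drops the forced exit, while column B falls back to the forced exit on the last bar.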

Run and analyze the backtest

Now that we have our data, entries, and exits, we can run the optimization and analyze the results.

portfolio = vbt.Portfolio.from_signals(selected_close, entries, exits)

total_return = portfolio.total_return()

total_return_by_type = total_return.unstack(level="exit_type")[exit_types]

total_return_by_type[exit_types].vbt.histplot(
    xaxis_title="Total return",
    xaxis_tickformat="%",
    yaxis_title="Count",
)

The result shows a histogram of total returns based on each stop type.

We can see that the trailing stop seems to have more negative returns while the take-profit has more positive returns. Let’s take a different look.

total_return_by_type.vbt.boxplot(
    yaxis_title='Total return',
    yaxis_tickformat='%'
)

The result is a box plot of total returns for each order type.

We can use the box plot to confirm that the take-profit order type outperforms the two others. Let’s quantify it further.

total_return_by_type.describe(percentiles=[])

You’ll see that the mean total return for the take-profit order type is 12.3%, while the means for the other two are below 10%.
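The mean is only one way to compare the exit types. As a sketch of the same unstack-and-summarize pattern, the snippet below uses a tiny made-up result set (the returns are invented for illustration and are not the backtest's output) and adds the median and the share of positive outcomes next to the mean.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [["SL", "TS", "TP"], [0.05, 0.10], [0, 1]],
    names=["exit_type", "stop_value", "split_idx"],
)
# Invented total returns, one per (exit_type, stop_value, split_idx)
total_return = pd.Series(
    [-0.05, 0.02, -0.10, 0.08, -0.03, 0.01, -0.06, 0.04, 0.05, 0.05, 0.10, 0.07],
    index=index,
)

# One column per exit type, one row per (stop_value, split_idx)
by_type = total_return.unstack(level="exit_type")[["SL", "TS", "TP"]]

summary = pd.DataFrame(
    {
        "mean": by_type.mean(),
        "median": by_type.median(),
        "share_positive": (by_type > 0).mean(),
    }
)
print(summary)
```

Looking at the median and the share of winning simulations alongside the mean helps show whether a few outliers are driving the average.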

Next steps

vectorbt is an advanced library suitable for walk-forward analysis and optimization.
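For readers new to walk-forward analysis, the idea is to choose parameters on one window and evaluate them on the window that follows, then roll forward. Here is a minimal pure-Python sketch of the window layout. The helper name walk_forward_windows and the 756/252/63 row counts are illustrative assumptions, and this is not vectorbt's API.

```python
def walk_forward_windows(n_rows, train_len, test_len):
    """Yield (train_range, test_range) pairs that roll forward by test_len rows."""
    start = 0
    while start + train_len + test_len <= n_rows:
        train = range(start, start + train_len)
        test = range(start + train_len, start + train_len + test_len)
        yield train, test
        start += test_len


# Assumed: ~3 years of daily bars, 1 year to optimize on, 1 quarter to test on.
windows = list(walk_forward_windows(n_rows=756, train_len=252, test_len=63))

for train, test in windows:
    print(f"train rows {train.start}-{train.stop - 1}, test rows {test.start}-{test.stop - 1}")
```

Each test window starts right after its training window, so the parameters are always evaluated on data they have not seen.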

The next step is to get the code running on your machine and learn more about the framework by reading the documentation.

If you’ve been a reader of this newsletter for a while, or a student of Getting Started With Python for Quant Finance, you’ll recognize this statement:
