The 2 tools for professional backtesting: Step-by-step Zipline for beginners

Backtesting is the only way to really test algorithmic trading strategies.

Doing it right will give you the best chance of making money.

Doing it wrong will guarantee a $0 account balance.

It’s not your fault: most backtesting frameworks lack features that reflect reality.

Commission, slippage, and special order types are all things algorithmic traders need to worry about.

If the framework you use doesn’t have these things, then you have two options:

❌ Waste time rebuilding your strategies in a “production” framework

❌ Lose money when your strategy starts trading live

Neither sounds great.

So what can you do about it?

Today, you'll use two professional backtesting tools: Zipline and PyFolio.

Zipline is an algorithmic trading simulator. It was built and maintained by Quantopian until the company shut down in 2020 and its team joined Robinhood. It is a professional backtesting framework, which means you can use it for both research and live trading.

No more rebuilding your strategy.

Some of what Zipline offers:

✅ Ease of Use: Zipline gets out of your way so that you can focus on strategy development

✅ Realistic: Zipline includes life-like slippage and commission models, order types, and order delays

✅ Batteries Included: Common statistics like moving average and linear regression are easy to use inside your strategy

✅ PyData Integration: Historical data and output of performance statistics use pandas DataFrames to integrate into other tools

✅ Statistics and Machine Learning Libraries: You can use libraries like scikit-learn to develop state-of-the-art trading systems

✅ Suite of performance and analysis tools: Zipline works well with performance analysis tools like PyFolio, Alphalens, and Empyrical

Imports and setup

Start with the imports.

You’ll use pandas_datareader to get data to compare your strategy with the S&P 500, matplotlib for charting, and PyFolio for performance analysis.

import pandas as pd
import pandas_datareader.data as web

import matplotlib.pyplot as plt

from zipline import run_algorithm
from zipline.api import order_target, record, symbol
from zipline.finance import commission, slippage

import pyfolio as pf

import warnings
warnings.filterwarnings('ignore')

Since you’re building the backtest in Jupyter Notebook, you need to load the Zipline “magics.” Running this lets you run the Zipline command line right in your Notebook.

%load_ext zipline

Ingesting free price data

Zipline creates data “bundles” for backtesting. You can build custom bundles to ingest any data you want.

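Today, you’ll use the pre-built Quandl bundle to ingest price data between 2000 and 2018 for free. The bundle reads a Quandl API key from the QUANDL_API_KEY environment variable, so set that before ingesting.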

! zipline ingest -b quandl

You will see Zipline working its magic to download the data and package it into highly efficient data stores.

Building the algorithm

Every Zipline strategy must have an initialize function. This is run at the beginning of the strategy.

Here, you set a counter to track the days, the symbol to trade, and set the commission and slippage models.

def initialize(context):
    context.i = 0
    context.asset = symbol("AAPL")

    context.set_commission(commission.PerShare(cost=0.01))
    context.set_slippage(slippage.FixedSlippage(spread=0.01))
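
To get a feel for what those two settings cost, here is a rough sketch in plain Python. It assumes a fixed spread moves each fill half the spread against you (up on buys, down on sells) and that the per-share model charges a flat fee for every share traded. The function names are hypothetical and are not part of Zipline's API.

```python
def fill_price(price, direction, spread=0.01):
    """Assumed fill: half the spread worse than the quoted price.

    direction is +1 for a buy and -1 for a sell.
    """
    return price + direction * spread / 2.0


def per_share_commission(shares, cost=0.01):
    """Assumed fee: a flat cost for every share traded."""
    return abs(shares) * cost


shares = 100
buy = fill_price(100.00, +1)    # pay slightly more than the quote
sell = fill_price(100.00, -1)   # receive slightly less than the quote

slippage_cost = shares * (buy - sell)
commission_cost = per_share_commission(shares) + per_share_commission(-shares)
round_trip_cost = slippage_cost + commission_cost

print(f"Round trip on {shares} shares costs about ${round_trip_cost:.2f}")
```

With these numbers, buying and selling 100 shares at an unchanged price costs about $3 before the strategy makes anything. A backtest that ignores these costs overstates returns on every trade.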

Every Zipline strategy must also have a handle_data function.

This function is run at every “bar.” Depending on your data, it might run every minute or day. handle_data is where your strategy logic lives.
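
The relationship between the two functions can be pictured as a simple loop. This is a toy sketch, not Zipline's actual engine: run_toy_backtest, the toy callbacks, and the bar dictionaries are all made up for illustration.

```python
from types import SimpleNamespace


def run_toy_backtest(bars, initialize, handle_data):
    """Call initialize once, then handle_data once per bar."""
    context = SimpleNamespace()   # shared state, like Zipline's context
    initialize(context)
    for bar in bars:
        handle_data(context, bar)
    return context


def toy_initialize(context):
    context.i = 0
    context.seen = []


def toy_handle_data(context, bar):
    context.i += 1
    context.seen.append(bar["close"])


bars = [{"close": 10.0}, {"close": 10.5}, {"close": 9.8}]
result = run_toy_backtest(bars, toy_initialize, toy_handle_data)
print(result.i, result.seen)
```

Anything you attach to context in initialize is still there on every later call to handle_data, which is how the day counter in this strategy survives from bar to bar.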

In today’s example, you will build a simple dual-moving average cross-over strategy.

def handle_data(context, data):
    # Skip first 50 days to get full windows
    context.i += 1
    if context.i < 50:
        return

    # Compute averages
    # data.history() returns the trailing window of daily prices
    # as a pandas Series for a single asset and field.
    short_mavg = data.history(
        context.asset, 
        "price", 
        bar_count=14,
        frequency="1d"
    ).mean()
    
    long_mavg = data.history(
        context.asset,
        "price",
        bar_count=50,
        frequency="1d"
    ).mean()

    # Trading logic
    if short_mavg > long_mavg:
        # order_target orders as many shares as needed to
        # achieve the desired number of shares.
        order_target(context.asset, 100)
    elif short_mavg < long_mavg:
        order_target(context.asset, 0)

Use the counter to make sure there is enough data to compute the moving averages. If not, skip processing for the day.

If there is enough data, get 14 and 50 days’ worth of prices and calculate the moving average.

Then, execute the trading logic.

When the 14-day moving average crosses over the 50-day moving average, the strategy buys 100 shares. When the 14-day moving average crosses under the 50-day moving average, it sells them.
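
Here is the same logic on a toy price list so you can trace it by hand. This is a plain-Python sketch, not Zipline code: target_shares is a hypothetical helper, the prices are invented, and the windows are shortened to 2 and 4 bars so the example stays small.

```python
from statistics import mean


def target_shares(history, short_window, long_window, size=100):
    """Return the desired position, or None when no order should be sent."""
    if len(history) < long_window:
        return None  # not enough data for the long average yet
    short_mavg = mean(history[-short_window:])
    long_mavg = mean(history[-long_window:])
    if short_mavg > long_mavg:
        return size
    if short_mavg < long_mavg:
        return 0
    return None  # averages are equal: leave the position alone


prices = [10, 10, 10, 10, 11, 12, 13, 12, 10, 8, 7]

position = 0
orders = []
for day in range(1, len(prices) + 1):
    target = target_shares(prices[:day], short_window=2, long_window=4)
    # Mimic order_target: trade only the difference to the target.
    delta = 0 if target is None else target - position
    position += delta
    orders.append(delta)

print(orders)
```

The strategy buys once when the short average moves above the long one and sells once when it drops below. On every other bar the target already matches the position, so there is nothing to trade.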

Run the backtest

The first step is to define the start and end dates.

start = pd.Timestamp('2000')
end = pd.Timestamp('2018')

Then, get data to compare your strategy with the S&P 500.

sp500 = web.DataReader('SP500', 'fred', start, end).SP500
benchmark_returns = sp500.pct_change()
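
pct_change turns the index levels into simple period-over-period returns. As a plain-Python sketch of that arithmetic (simple_returns is a hypothetical helper, and the levels are made-up numbers):

```python
import math


def simple_returns(levels):
    """Return level[t] / level[t-1] - 1, with NaN for the first entry."""
    returns = [math.nan]  # the first period has no prior level
    for prev, curr in zip(levels, levels[1:]):
        returns.append(curr / prev - 1)
    return returns


levels = [100.0, 110.0, 99.0]
returns = simple_returns(levels)
print(returns)
```

The first entry is NaN because there is no earlier level to compare against, which is also why the first value of benchmark_returns is missing.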

Finally, run the backtest.

perf = run_algorithm(
    start=start,
    end=end,
    initialize=initialize,
    handle_data=handle_data,
    capital_base=100000,
    benchmark_returns=benchmark_returns,
    bundle="quandl",
    data_frequency="daily",
)

Take a minute to explore the data in the perf DataFrame. There are 40 columns of rolling analytics! That’s the power of Zipline.

Analyze performance

Now that the backtest is finished, use PyFolio to get a breakdown of the results.

returns, positions, transactions = \
    pf.utils.extract_rets_pos_txn_from_zipline(perf)

pf.create_full_tear_sheet(
    returns,
    positions=positions,
    transactions=transactions,
    live_start_date="2016-01-01",
    round_trips=True,
)
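
The live_start_date argument tells PyFolio where the backtest ends and out-of-sample results begin, so the tear sheet can report the two periods separately. A rough sketch of that split, assuming returns dated on or after the cutoff count as out-of-sample (split_returns and the sample data are hypothetical):

```python
from datetime import date


def split_returns(returns_by_date, live_start):
    """Split dated returns into in-sample and out-of-sample parts."""
    in_sample = {d: r for d, r in returns_by_date.items() if d < live_start}
    out_of_sample = {d: r for d, r in returns_by_date.items() if d >= live_start}
    return in_sample, out_of_sample


daily_returns = {
    date(2015, 12, 30): 0.004,
    date(2015, 12, 31): -0.002,
    date(2016, 1, 4): 0.001,
    date(2016, 1, 5): -0.003,
}

in_sample, out_of_sample = split_returns(daily_returns, date(2016, 1, 1))
print(len(in_sample), len(out_of_sample))
```

Comparing the two periods is a quick check on overfitting: a strategy that looks great in-sample and falls apart afterwards deserves suspicion.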

This creates a full tear sheet based on your backtest results. There’s a ton of information here, but here are the highlights:

Performance analysis

Cumulative returns

Rolling volatility, Sharpe ratio, and drawdowns
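
For reference, these statistics boil down to a few lines of arithmetic on the daily returns. The sketch below computes them over a single window and uses common conventions that are assumptions here: 252 trading days per year, a zero risk-free rate, and sample standard deviation. PyFolio's own calculations may differ in detail, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed number of trading days per year


def annualized_volatility(returns):
    """Sample standard deviation of daily returns, scaled to a year."""
    return stdev(returns) * math.sqrt(TRADING_DAYS)


def annualized_sharpe(returns):
    """Mean over standard deviation of daily returns, scaled to a year."""
    return mean(returns) / stdev(returns) * math.sqrt(TRADING_DAYS)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded return path."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst


sample = [0.10, -0.50, 0.20]
print(max_drawdown(sample))
print(annualized_sharpe([0.01, 0.02, 0.03]))
```

In the sample, the path gains 10%, loses half its value, then recovers part of the loss, so the maximum drawdown is the 50% drop from the peak.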

Now you’re comfortable setting up the Zipline backtesting framework. By doing so, you can use the most powerful toolset for algorithmic trading.