How to use the information coefficient to measure your alpha

During my master’s program, my professors gave me a lot of theories I couldn’t use.

I remember my interest rate derivatives professor telling me:

“None of what I’m going to teach you actually works in practice.”

Now the context behind that statement was negative interest rates during the Great Financial Crisis.

But the point remains.

Of all the theories I couldn’t use, there were a few bright spots of practical wisdom.

One of those bright spots was factor investing and associated performance monitoring with the information coefficient.

One of my professors was a portfolio manager at a large hedge fund and he spent all his time building factor models and measuring their performance.

He spent the entire semester showing us exactly how.

It took me many months and several hundred lines of MATLAB code to get it right.

Fortunately, there is an open-source tool that does it in a few lines of Python.

How to use the information coefficient to measure your alpha

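The information coefficient (IC) measures the correlation between the values of an alpha factor and the stock returns that follow. It ranges from -1 to 1: values near 1 mean the factor orders stocks the same way their subsequent returns do, and values near 0 mean it has no predictive power.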

It’s a metric used to evaluate the effectiveness of an alpha factor in generating returns.
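
To make the definition concrete, here is a minimal sketch of the information coefficient for a single date. It assumes the rank-based (Spearman) form of the correlation, which is the one Alphalens reports. The factor values and forward returns are made-up numbers for illustration only.

```python
from scipy import stats

# Hypothetical factor scores for five stocks on one date
factor_values = [0.9, 0.4, 0.7, 0.1, 0.5]
# Hypothetical returns of the same five stocks over the following period
forward_returns = [0.03, 0.01, 0.02, -0.02, -0.01]

# Spearman correlation compares the two orderings, not the raw magnitudes
ic, p_value = stats.spearmanr(factor_values, forward_returns)
print(f"IC = {ic:.2f}")  # IC = 0.90
```

The factor gets the order almost right (only two neighboring stocks are swapped), so the coefficient is close to 1. Repeating this for every date gives the time series that the rest of this walkthrough plots.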

The information coefficient is a core ingredient of Richard Grinold’s fundamental law of active management, published in 1989, and it’s been widely used in quantitative finance ever since.

The information coefficient is important for retail traders, algorithm developers, and data analysts because it helps them determine which alpha factors are worth using in their trading strategies. Quants use the information coefficient to measure the predictive power of an alpha factor and to optimize their trading strategies.

Now you can too.

You will extend the momentum strategy you built last week by adding short positions to the portfolio and tracking the factor’s rank. These changes will let you use AlphaLens to assess the performance of the alpha factor.

Start with the imports. Define two constants, N_LONGS and N_SHORTS, which will be the number of securities the strategy will go long and short in the portfolio.

import pandas as pd
from scipy import stats
import matplotlib.pyplot as plt

from zipline import run_algorithm
from zipline.pipeline import Pipeline, CustomFactor
from zipline.pipeline.data import USEquityPricing
from zipline.api import (
    attach_pipeline,
    calendars,
    pipeline_output,
    date_rules,
    time_rules,
    set_commission,
    set_slippage,
    record,
    order_target_percent,
    get_open_orders,
    schedule_function
)

from alphalens.utils import (
    get_clean_factor_and_forward_returns,
    get_forward_returns_columns
)
from alphalens.plotting import plot_ic_ts
from alphalens.performance import factor_information_coefficient

import warnings
warnings.filterwarnings("ignore")

N_LONGS = N_SHORTS = 10

After the imports, load the Zipline extension and download data. You’ll need a free API key from Nasdaq Data Link.

%load_ext zipline
! zipline ingest -b quandl

Create a custom momentum factor and build the pipeline

Now, create a custom factor to calculate the momentum and define the pipeline for our trading algorithm.

class Momentum(CustomFactor):
    # Default inputs
    inputs = [USEquityPricing.close]

    # Compute momentum as the ratio of the last close to the first close
    def compute(self, today, assets, out, close):
        out[:] = close[-1] / close[0]

def make_pipeline():

    twenty_day_momentum = Momentum(window_length=20)
    thirty_day_momentum = Momentum(window_length=30)

    positive_momentum = (
        (twenty_day_momentum > 1) &
        (thirty_day_momentum > 1)
    )

    return Pipeline(
        columns={
            'longs': thirty_day_momentum.top(N_LONGS),
            # Shorts come from the bottom of the ranking, not the top
            'shorts': thirty_day_momentum.bottom(N_SHORTS),
            'ranking': thirty_day_momentum.rank(ascending=False)
        },
        screen=positive_momentum
    )

The factor divides the last price in the window by the first price. In other words, if the ratio is greater than 1, the last price is above the first price and there is positive momentum.
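
The arithmetic inside the factor can be checked outside Zipline. The sketch below assumes the same layout Zipline passes to `compute` (rows are days with the oldest first, columns are assets) and uses made-up prices.

```python
import numpy as np

# Hypothetical closing prices: rows are days (oldest first), columns are assets
close = np.array([
    [100.0, 50.0],
    [102.0, 49.0],
    [110.0, 45.0],
])

# Same arithmetic as the factor: last row divided by first row
momentum = close[-1] / close[0]
print(momentum)  # [1.1 0.9]
```

The first asset rose over the window, so its ratio is above 1. The second fell, so its ratio is below 1 and it would not pass a "greater than 1" screen.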

Define a function that creates a pipeline with two momentum calculations: one for 20-day momentum and another for 30-day momentum.

The Pipeline returns the top and bottom trending stocks that have positive momentum. It also returns their factor ranking.
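
The selection described here (strongest names, weakest names, and a descending rank) can be illustrated with plain pandas. This is only an analogue of the pipeline columns, not Zipline code, and the tickers and values are hypothetical.

```python
import pandas as pd

# Hypothetical momentum ratios for five made-up tickers
momentum = pd.Series(
    {"AAA": 1.30, "BBB": 1.05, "CCC": 1.20, "DDD": 1.10, "EEE": 1.02}
)
n = 2

longs = momentum.nlargest(n).index.tolist()    # strongest momentum
shorts = momentum.nsmallest(n).index.tolist()  # weakest momentum
ranking = momentum.rank(ascending=False)       # rank 1 = strongest momentum

print(longs)           # ['AAA', 'CCC']
print(shorts)          # ['EEE', 'BBB']
print(ranking["AAA"])  # 1.0
```

Note that with a descending rank, a smaller number means stronger momentum. That matters later when reading the sign of any correlation computed on the ranking.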

Implement the trading strategy

Implement the trading strategy by initializing the pipeline, scheduling the rebalancing function, and executing the trades.

def before_trading_start(context, data):
    context.factor_data = pipeline_output("factor_pipeline")

def initialize(context):
    attach_pipeline(make_pipeline(), "factor_pipeline")
    schedule_function(
        rebalance,
        date_rules.week_start(),
        time_rules.market_open(),
        calendar=calendars.US_EQUITIES,
    )

def rebalance(context, data):

    factor_data = context.factor_data
    record(factor_data=factor_data.ranking)

    assets = factor_data.index
    record(prices=data.current(assets, 'price'))

    longs = assets[factor_data.longs]
    shorts = assets[factor_data.shorts]
    divest = set(context.portfolio.positions.keys()) - set(longs.union(shorts))

    exec_trades(data, assets=divest, target_percent=0)
    exec_trades(data, assets=longs, target_percent=1 / N_LONGS)
    exec_trades(data, assets=shorts, target_percent=-1 / N_SHORTS)

def exec_trades(data, assets, target_percent):
    # Loop through every asset...
    for asset in assets:
        # ...if the asset is tradeable and there are no open orders...
        if data.can_trade(asset) and not get_open_orders(asset):
            # ...execute the order against the target percent
            order_target_percent(asset, target_percent)

The before_trading_start function gets the pipeline output before every trading session. The initialize function attaches the pipeline and schedules the rebalancing function to run at the start of each week at market open.

The rebalance function gets the pipeline output, filters the assets to go long and short, and figures out which assets to divest. It then executes the trades using the exec_trades function.

In the exec_trades function, loop through the assets and check if they are tradable and if there are no open orders. If both conditions are met, place an order targeting the specified percentage of the portfolio.
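
The weighting logic in the rebalance step can be sketched without Zipline. The helper below is hypothetical and exists only for illustration. It also assumes the weight is one over the number of names actually selected, whereas the strategy uses the fixed constants N_LONGS and N_SHORTS.

```python
def target_weights(held, longs, shorts):
    """Return {asset: target weight} for one rebalance.

    Hypothetical helper for illustration; it is not part of Zipline.
    """
    weights = {}
    # Anything currently held but no longer selected is closed out
    for asset in set(held) - set(longs) - set(shorts):
        weights[asset] = 0.0
    # Equal positive weight across longs, equal negative weight across shorts
    for asset in longs:
        weights[asset] = 1 / len(longs)
    for asset in shorts:
        weights[asset] = -1 / len(shorts)
    return weights


weights = target_weights(
    held=["AAA", "BBB"],
    longs=["AAA", "CCC"],
    shorts=["DDD", "EEE"],
)
print(weights)
```

In this example BBB drops out of the selection and is targeted at zero, while the long weights sum to 1 and the short weights sum to -1, so the long and short sides offset each other in dollar terms.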

And finally, run the algorithm.

start = pd.Timestamp('2015')
end = pd.Timestamp('2018')
perf = run_algorithm(
    start=start,
    end=end,
    initialize=initialize,
    before_trading_start=before_trading_start,
    capital_base=100_000,
    bundle="quandl",
)

Assess the factor performance with AlphaLens

Manipulate the price and factor data so AlphaLens can read them.

prices = pd.concat(
    [df.to_frame(d) for d, df in perf.prices.dropna().items()],
    axis=1
).T

prices.columns = [col.symbol for col in prices.columns]

prices.index = prices.index.normalize()

Construct a DataFrame where each column is a stock and each row is a date. Do this by concatenating the recorded price Series, each of which corresponds to one rebalance date, and transposing the result. Then convert the column names to ticker strings and normalize the dates to midnight.
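
The reshaping step is easier to follow on a toy input. The sketch below uses a plain dictionary of per-date Series as a hypothetical stand-in for the recorded prices; the tickers, dates, and prices are invented.

```python
import pandas as pd

# Hypothetical stand-in for the recorded data: one Series of prices per date
snapshots = {
    pd.Timestamp("2015-01-05 21:00", tz="UTC"): pd.Series({"AAA": 10.0, "BBB": 20.0}),
    pd.Timestamp("2015-01-12 21:00", tz="UTC"): pd.Series({"AAA": 11.0, "CCC": 30.0}),
}

# Each Series becomes one column named after its date;
# transposing then puts dates on the rows and assets on the columns
wide = pd.concat([s.to_frame(d) for d, s in snapshots.items()], axis=1).T

# Drop the time of day so every row is stamped at midnight
wide.index = wide.index.normalize()
print(wide)
```

Assets that were not recorded on a given date show up as NaN in that row, which is why the union of all tickers appears in the columns.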

Now, do the same for the factor data.

factor_data = pd.concat(
    [df.to_frame(d) for d, df in perf.factor_data.dropna().items()],
    axis=1
).T

factor_data.columns = [col.symbol for col in factor_data.columns]

factor_data.index = factor_data.index.normalize()

factor_data = factor_data.stack()

factor_data.index.names = ['date', 'asset']

This time, stack the result into a Series with a MultiIndex: the date is the first index level and the asset is the second. The values are the factor rankings.
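
As a small illustration of what stacking does, the sketch below turns a wide table of hypothetical rankings into the long, two-level layout. The tickers and values are invented.

```python
import pandas as pd

dates = pd.to_datetime(["2015-01-05", "2015-01-12"])
# Hypothetical rankings: rows are dates, columns are assets
wide = pd.DataFrame({"AAA": [1.0, 2.0], "BBB": [2.0, 1.0]}, index=dates)

# stack() moves the column labels into a second index level
stacked = wide.stack()
stacked.index.names = ["date", "asset"]
print(stacked)
```

Each row of the result is now identified by a (date, asset) pair, which is the shape Alphalens expects for factor values.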

Finally, you can analyze the factor returns using AlphaLens.

alphalens_data = get_clean_factor_and_forward_returns(
    factor=factor_data,
    prices=prices,
    periods=(5, 10, 21, 63),
    quantiles=5
)

This utility function computes forward returns over holding periods of 5, 10, 21, and 63 days and aligns them with the factor data. The result also includes the factor ranking and the quantile each stock falls into.
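
The core idea of a forward return can be sketched in a few lines of pandas. This is a simplification under stated assumptions: it uses a one-row horizon and invented prices, and it ignores the calendar handling and data cleaning that Alphalens performs.

```python
import pandas as pd

# Hypothetical prices: rows are dates, columns are assets
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 121.0], "BBB": [50.0, 45.0, 54.0]},
    index=pd.to_datetime(["2015-01-05", "2015-01-06", "2015-01-07"]),
)

period = 1
# Return earned from each date to `period` rows later,
# shifted back so it lines up with the date the position is opened
forward = prices.pct_change(period).shift(-period)
print(forward)
```

The last row is NaN because there is no later price to compare against. The same thing happens at the end of a backtest for every holding period.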

Now, you can get the information coefficient for each forward return period. For example, this is the historical information coefficient for the 5-day holding period.

ic = factor_information_coefficient(alphalens_data)

plot_ic_ts(ic[["5D"]])
plt.tight_layout()


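This chart shows that the information coefficient varies over time but is generally positive, which suggests the factor ranking carries information about subsequent returns. Because the ranking is descending (rank 1 is the strongest momentum), keep the direction in mind when reading the sign: a positive coefficient means stocks with larger rank numbers, that is, weaker momentum, tended to earn higher returns.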
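
A chart is a good first look, but it helps to put numbers on "generally positive". The sketch below summarizes an IC series by its mean, its standard deviation, the ratio of the two, and a t-test against zero. The series is randomly generated as a stand-in; in practice you would pass a column such as ic["5D"].

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical daily IC values; a real analysis would use a column of `ic`
rng = np.random.default_rng(0)
ic_series = pd.Series(rng.normal(loc=0.03, scale=0.10, size=250))

mean_ic = ic_series.mean()
std_ic = ic_series.std()
# Mean IC per unit of IC volatility; higher means a steadier signal
risk_adjusted_ic = mean_ic / std_ic
# One-sample t-test of the null hypothesis that the mean IC is zero
t_stat, p_value = stats.ttest_1samp(ic_series.to_numpy(), 0.0)

print(f"mean={mean_ic:.3f} std={std_ic:.3f} ratio={risk_adjusted_ic:.3f}")
print(f"t={t_stat:.2f} p={p_value:.3f}")
```

A small mean IC can still be useful if it is consistent, which is what the ratio and the t-statistic try to capture.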

You can also see the mean information coefficient of the factor for each holding period for each year.

ic_by_year = ic.resample('A').mean()
ic_by_year.index = ic_by_year.index.year
ic_by_year.plot.bar(figsize=(14, 6))
plt.tight_layout();

Throughout the backtest period, the momentum factor performed best over the shorter holding periods.
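
Depending on your pandas version, the 'A' resampling alias may raise a deprecation warning (recent releases prefer 'YE'). A version-independent way to get the same yearly means is to group by the year of the index. The sketch below shows this on a tiny, invented IC table.

```python
import pandas as pd

# Hypothetical IC values for one holding period across two years
dates = pd.to_datetime(["2015-03-02", "2015-09-01", "2016-03-01", "2016-09-01"])
ic = pd.DataFrame({"5D": [0.02, 0.04, -0.01, 0.03]}, index=dates)

# Group rows by calendar year and average within each year
ic_by_year = ic.groupby(ic.index.year).mean()
print(ic_by_year)
```

The result is already indexed by year, so it can be passed straight to a bar plot.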

Professional money managers from BlackRock to Vanguard use factors to invest. Building factors and measuring their performance used to be reserved for the quants at these institutions. Now, you can use the same techniques they do.