Portfolio optimization usually requires an estimate of the future returns of the assets in the portfolio. This is hard because we can’t see into the future.

*Traditional risk parity uses a quadratic optimizer*

A cutting-edge technique called Hierarchical Risk Parity (HRP) uses graph theory and machine learning to build a hierarchical structure of the investments.

By the end of today’s newsletter, you’ll be able to create your own HRP-based portfolio among 25 sector-based ETFs.

Are you ready?

## Build state-of-the-art portfolios with machine learning

HRP was introduced by Marcos Lopez de Prado in a 2016 paper. HRP applies graph theory and machine learning to build a diversified portfolio based on the covariance matrix.

HRP is unlike traditional portfolio optimization methods. It can build a portfolio even when the covariance matrix is ill-conditioned or singular. Quadratic optimizers cannot, because they need to invert the covariance matrix.
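To see why a singular covariance matrix is a problem, here's a minimal sketch on made-up data. It assumes fewer observations than assets, which guarantees a rank-deficient covariance matrix. The random returns and variable names are illustrative only. Inverse-variance weights, a building block of HRP, only need the diagonal, so they still work.

```python
import numpy as np

# Hypothetical data: 10 observations of 25 assets (fewer rows than columns)
rng = np.random.default_rng(42)
n_obs, n_assets = 10, 25
toy_returns = rng.normal(0.0, 0.01, size=(n_obs, n_assets))

cov = np.cov(toy_returns, rowvar=False)

# The sample covariance has rank at most n_obs - 1, so it cannot be inverted
rank = np.linalg.matrix_rank(cov)
print(f"rank {rank} for a {n_assets}x{n_assets} covariance matrix")

# Inverse-variance weights use only the diagonal, so no inversion is needed
inv_var = 1.0 / np.diag(cov)
weights = inv_var / inv_var.sum()
print(weights.round(3))
```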

In the Monte Carlo experiments reported in the original paper, HRP delivered lower out-of-sample variance than traditional optimization methods.

Let’s get started!

### Imports and set up

We’ll use the excellent Riskfolio-Lib to build our HRP portfolio and OpenBB for market data.

```python
import pandas as pd
from openbb import obb
import riskfolio as rp
```

We’ll use market data from 25 sector-based ETFs to construct our portfolio.

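```python
# 25 sector and industry ETFs
assets = [
    "XLE", "XLF", "XLU", "XLI", "GDX",
    "XLK", "XLV", "XLY", "XLP", "XLB",
    "XOP", "IYR", "XHB", "ITB", "VNQ",
    "GDXJ", "IYE", "OIH", "XME", "XRT",
    "SMH", "IBB", "KBE", "KRE", "XTL",
]

# Download prices and pivot to one column of closing prices per symbol
data = (
    obb
    .equity
    .price
    .historical(assets, provider="yfinance")
    .to_df()
    .pivot(columns="symbol", values="close")
)

# Percentage returns; drop rows with missing values (including the first)
returns = data.pct_change().dropna()
```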

This code uses OpenBB to download the market data as a DataFrame and generate the daily returns for each ETF.
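To see what the pivot and `pct_change` steps do, here's a small sketch on made-up prices. The long layout (one row per date and symbol) is an assumption about what the OpenBB call returns, and the prices are invented for illustration.

```python
import pandas as pd

# Hypothetical long-format prices: one row per (date, symbol)
dates = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"])
long_df = pd.DataFrame(
    {
        "symbol": ["XLE", "XLF"] * 3,
        "close": [100.0, 50.0, 102.0, 49.0, 101.0, 50.0],
    },
    index=dates.repeat(2),
)
long_df.index.name = "date"

# Wide format: dates as rows, one column of closing prices per symbol
wide = long_df.pivot(columns="symbol", values="close")

# Return on each date = close / previous close - 1
toy_returns = wide.pct_change().dropna()
print(toy_returns)
```

The first date drops out because it has no prior close to compare against.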

### Build the optimal portfolio

We can plot the dendrogram to visualize which ETFs are clustered together.

```python
# Cluster the ETFs on Pearson correlation with single linkage and plot the tree
ax = rp.plot_dendrogram(
    returns=returns,
    codependence="pearson",
    linkage="single",
    k=None,
    max_k=10,
    leaf_order=True,
    ax=None,
)
```

The result is the following image.

The plot visualizes the hierarchical clustering of assets based on their historical return correlations. It illustrates how clusters of assets are merged at each hierarchical level and can give us insight into the correlation structure within a portfolio. The method takes asset returns and a clustering method to compute and plot the dendrogram.
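If you want to see the clustering without the plot, here's a rough sketch using SciPy directly. It assumes the common correlation distance `sqrt(0.5 * (1 - corr))` and single linkage. The toy returns are simulated so that assets 0 and 1 share one driver and assets 2 and 3 share another. This illustrates the idea and is not a copy of what Riskfolio-Lib does internally.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, leaves_list, linkage
from scipy.spatial.distance import squareform

# Simulated returns: two pairs of assets, each pair driven by its own factor
rng = np.random.default_rng(0)
n_obs = 500
factor_a = rng.normal(0.0, 0.01, n_obs)
factor_b = rng.normal(0.0, 0.01, n_obs)
noise = rng.normal(0.0, 0.001, size=(n_obs, 4))
toy_returns = np.column_stack([factor_a, factor_a, factor_b, factor_b]) + noise

# Turn correlations into distances: highly correlated assets are close
corr = np.corrcoef(toy_returns, rowvar=False)
dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
np.fill_diagonal(dist, 0.0)

# Single linkage merges the two closest clusters at each step
tree = linkage(squareform(dist, checks=False), method="single")

# Cut the tree into two clusters and read off the dendrogram leaf order
labels = fcluster(tree, t=2, criterion="maxclust")
leaf_order = leaves_list(tree)
print(labels, leaf_order)
```

Each row of `tree` records one merge, which is exactly what the dendrogram draws.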

Building the optimal portfolio based on the hierarchy takes only a couple of lines of code.

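```python
# Build the hierarchical clustering portfolio object from the returns
port = rp.HCPortfolio(returns=returns)

# Estimate the HRP weights
w = port.optimization(
    model="HRP",
    codependence="pearson",
    rm="MV",
    rf=0.05,
    linkage="single",
    max_k=10,
    leaf_order=True,
)
```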

The codependence parameter is set to “pearson” to use the Pearson correlation to measure the relationships between asset returns. The risk measure rm is set to “MV”, which uses variance as the measure of risk when allocating weights across clusters.

Additional parameters like linkage, max_k, and leaf_order are specified to fine-tune the clustering and dendrogram construction process.

The result is a pandas DataFrame with the optimal weight for each of the assets.
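To get a feel for how HRP turns a hierarchy into weights, here's a simplified sketch of the recursive bisection step with variance as the risk measure. It assumes you already have an asset ordering, such as the dendrogram's leaf order. The function names and the toy covariance matrix are made up for illustration, and this is not Riskfolio-Lib's implementation.

```python
import numpy as np


def cluster_variance(cov, idx):
    """Variance of a cluster held at inverse-variance weights."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)


def recursive_bisection(cov, order):
    """Split the ordered assets in half repeatedly, giving less weight
    to the riskier half at each split. `order` must list every asset once."""
    weights = np.ones(cov.shape[0])
    clusters = [list(order)]
    while clusters:
        next_clusters = []
        for cluster in clusters:
            if len(cluster) < 2:
                continue  # a single asset cannot be split further
            half = len(cluster) // 2
            left, right = cluster[:half], cluster[half:]
            var_left = cluster_variance(cov, left)
            var_right = cluster_variance(cov, right)
            # Share of the cluster's weight that goes to the left half
            alpha = 1.0 - var_left / (var_left + var_right)
            weights[left] *= alpha
            weights[right] *= 1.0 - alpha
            next_clusters += [left, right]
        clusters = next_clusters
    return weights


# Hypothetical covariance: four uncorrelated assets with different variances
toy_cov = np.diag([0.04, 0.01, 0.01, 0.04])
toy_weights = recursive_bisection(toy_cov, order=[0, 1, 2, 3])
print(toy_weights)  # approximately [0.1, 0.4, 0.4, 0.1]
```

With uncorrelated assets, the result matches plain inverse-variance weights. Once correlations are present, the ordering starts to matter.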

### Visualize the results

Riskfolio-Lib makes it easy to visualize the results of the optimization.

```python
# Pie chart of the portfolio weights
ax = rp.plot_pie(
    w=w,
    title="HRP Naive Risk Parity",
    others=0.05,
    nrow=25,
    cmap="tab20",
    height=8,
    width=10,
    ax=None,
)
```

This code generates a pie chart with the weights of each asset.

We can also visualize the risk contribution of each asset.

```python
# Bar chart of each asset's contribution to portfolio risk
ax = rp.plot_risk_con(
    w=w,
    cov=returns.cov(),
    returns=returns,
    rm="MV",
    rf=0,
    alpha=0.05,
    color="tab:blue",
    height=6,
    width=10,
    t_factor=252,
    ax=None,
)
```

The risk contribution of each asset in a portfolio quantifies how much individual assets contribute to the total risk, considering both their own volatility and their correlation with other assets. We can see the highest risk contribution is from OIH, which is an oil ETF.

Risk contribution is important for identifying assets that disproportionately increase portfolio risk.
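Here's a minimal sketch of the idea when risk is measured by variance. Each asset's share is its weight times its marginal contribution, `w_i * (cov @ w)_i`, divided by the total portfolio variance. The two-asset covariance matrix is made up, and the scaling may not match exactly what `plot_risk_con` draws.

```python
import numpy as np


def variance_risk_contributions(weights, cov):
    """Fraction of portfolio variance attributable to each asset."""
    weights = np.asarray(weights, dtype=float)
    marginal = cov @ weights            # sensitivity of variance to each weight
    contributions = weights * marginal  # these sum to the portfolio variance
    return contributions / contributions.sum()


# Hypothetical covariance: asset 0 is much more volatile than asset 1
toy_cov = np.array([
    [0.040, 0.006],
    [0.006, 0.010],
])
shares = variance_risk_contributions([0.5, 0.5], toy_cov)
print(shares.round(3))
```

Even with equal weights, the more volatile asset carries most of the risk, which is the kind of imbalance the chart helps you spot.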

### Next steps

We only scratched the surface of HRP. As a next step, swap in the assets from your own portfolio and optimize for a risk measure other than variance. You can try “MAD”, “CVaR”, or “VaR”. Check the documentation for details.
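If these risk measures are new to you, here's a rough sketch of what each one captures, computed on simulated daily returns. These are simple historical estimators written for illustration. Riskfolio-Lib has its own conventions, so its numbers may differ slightly.

```python
import numpy as np


def mad(returns):
    """Mean absolute deviation: average distance from the mean return."""
    returns = np.asarray(returns, dtype=float)
    return float(np.mean(np.abs(returns - returns.mean())))


def historical_var(returns, alpha=0.05):
    """Value at Risk: the loss at the alpha quantile of returns."""
    return float(-np.quantile(returns, alpha))


def historical_cvar(returns, alpha=0.05):
    """Conditional VaR: the average loss in the worst alpha tail."""
    returns = np.asarray(returns, dtype=float)
    cutoff = np.quantile(returns, alpha)
    return float(-returns[returns <= cutoff].mean())


# Simulated daily returns, for illustration only
rng = np.random.default_rng(7)
sample = rng.normal(0.0005, 0.01, size=1000)

print(f"MAD:  {mad(sample):.4f}")
print(f"VaR:  {historical_var(sample):.4f}")
print(f"CVaR: {historical_cvar(sample):.4f}")
```

CVaR is never smaller than VaR at the same alpha, because it averages the losses beyond the VaR cutoff.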