The big gap between simulated and live performance can be largely explained by two common forces dominant in backtests—overfitting (or data-snooping bias) and ignoring transaction costs—both of which effectively bias investors’ return expectations higher than may be realistic.
Data-snooping risk. We all know there are no ugly backtests! More precisely, ugly-looking backtest results are rarely published in journals or client-facing materials. In the academic world, publication bias is well recognized, meaning that statistically significant results are three times more likely to be published than insignificant ones.
In our industry, quantitative managers are data mining every day in an attempt to identify signals that can accurately forecast a stock’s future return, and thus help improve a strategy’s performance. Smart beta strategies—model-driven strategies that involve the systematic selecting, weighting, and rebalancing of portfolio holdings based on factors or characteristics—are not exempt from this common practice. Importantly, this process should have proper guard rails to control data-snooping risk.
Even though a rich academic literature points out this problem and offers various solutions to mitigate it (McLean and Pontiff, Novy-Marx, and Harvey and Liu), little has changed in practice. Investment managers still share their beautiful backtest results with investors, making few adjustments to the standard statistics. After all, who wants to make their results look worse?
We suggest a straightforward way for investors to establish realistic future return expectations. Backtests should be based on economically sound ideas that address the underlying relationship between signals and future performance. In analyzing a strategy, investors should consider who is on the other side of the trade, and why they would willingly choose to forgo the excess return the strategy is claiming to capture. Once the theory behind the excess return is established, the portfolio construction rules can be evaluated to assess their ability to best capture that excess return, after costs.
Take, for example, the Research Affiliates Fundamental Index™ (RAFI™), which is based on Research Affiliates’ central investment belief of long-horizon mean reversion. We believe investors have a bias toward owning more of what has very good 3- to 5-year returns and an aversion to owning securities that have fared poorly. In keeping with this theory, a disciplined rebalancing strategy will sell recent winners and buy recent losers to produce an excess return. Those on the other side of these trades will be doing the opposite, taking the more inherently comfortable path of favoring recent winners and shunning recent losers.
The Fundamental Index uses accounting metrics to provide a stable anchor for contra-trading. When the market overestimates the future prospects of a stock and thus prices it too high, the fundamental-based weighting methodology helps investors pull back their investment in the stock. When the price mean reverts to the level justified by its discounted future cash flows, RAFI delivers alpha over a market-capitalization-weighted index by avoiding an overallocation to the stock, which would otherwise arise from price inefficiency.
The choice of the accounting metrics used in the weighting methodology does not really matter. The goal of the metrics is simply to capture the economic footprint of a company, independent of market perception, as a means of offering large capacity to investors by directing greater allocations to companies with higher liquidity. When a backtest deviates from solid economic intuition and theoretical support, the data-mining exercise loses a lot of credibility, and the results are useless, at best.
We strongly advocate for simplicity in smart beta methodologies to address data-snooping risk. The higher degrees of freedom in data mining, which are associated with a more complex methodology, give users more “knobs” to turn, potentially leading to stronger upward biases in in-sample outcomes. For example, an optimization-based approach, by its own definition, leads to the best in-sample return, volatility, or other targeted portfolio characteristic. While optimization and other complex methods of portfolio construction are very useful in obtaining certain objectives, adopting them simply due to their attractive in-sample performance is a dangerous practice.
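The effect of extra "knobs" can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a method from the literature cited above: every simulated strategy has zero true alpha, monthly returns are independent normal draws, and the function names, the 120-month window, and the 4% monthly volatility are all hypothetical choices.

```python
import random
import statistics


def annualized_sharpe(monthly_returns):
    """Annualized Sharpe ratio computed from monthly excess returns."""
    mean = statistics.fmean(monthly_returns)
    stdev = statistics.stdev(monthly_returns)
    return (mean / stdev) * (12 ** 0.5)


def best_in_sample_sharpe(n_variants, n_months=120, monthly_vol=0.04, seed=0):
    """Best backtest Sharpe among n_variants strategies whose true alpha is zero.

    Each variant stands in for one setting of the methodology's knobs.
    All parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    sharpes = []
    for _ in range(n_variants):
        returns = [rng.gauss(0.0, monthly_vol) for _ in range(n_months)]
        sharpes.append(annualized_sharpe(returns))
    return max(sharpes)


# One pre-specified strategy versus the best of 200 tried variants.
single = best_in_sample_sharpe(1)
many = best_in_sample_sharpe(200)
print(f"one variant tested:    Sharpe = {single:.2f}")
print(f"best of 200 variants:  Sharpe = {many:.2f}")
```

Because none of the simulated variants has any real edge, whatever Sharpe ratio the best of them shows is purely a product of selection. The more variants tried, the better the winner tends to look in sample.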
Transaction costs. The other important factor that can explain disappointing live performance is implementation cost. Implementation costs are contributing to an ever larger portion of the gap between the expected performance of a smart beta index and its live record as the total amount of assets managed by these strategies rapidly grows.
The costs associated with executing a strategy are both explicit and implicit. The explicit costs, such as brokerage commissions and settlement/clearing charges, are directly observable, and explain a significant part of performance slippage, or the amount a fund’s return underperforms the index it is tracking. The implicit costs, referred to as market impact costs, are the changes in a stock’s price around index rebalancing dates, especially when the strategy’s assets under management are large; that is, the prices of stocks being purchased are temporarily inflated, and those being sold are temporarily depressed. As prices revert in the days following the rebalancing, the strategy loses money. This outcome is not easily observable in smart beta strategies because the impact is embedded in the return of the underlying index, whose value is calculated on the basis of closing prices.
For strategy implementers, whose primary goal is reducing tracking error, a rational response is to lump all trading around the market close so the portfolio can perfectly track the index. These clustered trades also happen to be the most costly because they are reducing, within a very short time span, already-limited liquidity. Chow et al. (2017), after studying various portfolio characteristics related to implementation, recommend spreading trades over several days around the rebalancing, if possible.
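To see why spreading trades can help, consider a rough sketch that assumes a square-root market impact model, in which the cost per dollar traded grows with the square root of the trade's share of daily volume. This model, the 100-basis-point coefficient, and the function names are assumptions made here for illustration; they are not taken from Chow et al. The sketch also assumes that impact does not carry over from one day to the next.

```python
def impact_cost_bps(trade_dollars, adv_dollars, coefficient_bps=100.0):
    """Assumed square-root impact: cost in basis points of the amount traded."""
    participation = trade_dollars / adv_dollars
    return coefficient_bps * participation ** 0.5


def total_cost_dollars(trade_dollars, adv_dollars, n_days):
    """Total impact cost when the trade is split evenly over n_days."""
    slice_dollars = trade_dollars / n_days
    per_slice_bps = impact_cost_bps(slice_dollars, adv_dollars)
    return n_days * slice_dollars * per_slice_bps / 10_000


# Hypothetical trade: $10 million in a stock with $20 million average daily volume.
one_day = total_cost_dollars(10_000_000, 20_000_000, n_days=1)
five_days = total_cost_dollars(10_000_000, 20_000_000, n_days=5)
print(f"all at one close: ${one_day:,.0f}")
print(f"over five days:   ${five_days:,.0f}")
```

Under these assumptions the cost falls with the square root of the number of days. The trade-off is higher tracking error, because the portfolio no longer matches the index exactly on the rebalancing date.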
Another way to lower market impact costs is to avoid smart beta strategies that invest in stocks with low liquidity. Screening out micro-cap and thinly traded companies’ stocks is an important step in ensuring a strategy is “tradable,” even before considering the market impact of trades. Alphas produced “on paper” cannot successfully be reproduced when, for example, $10 million in buying power is attempting to take advantage of a mispricing opportunity in a stock of a company with a total market capitalization of $5 million. When they conduct simulations, thoughtful researchers will consider the trading volume of a stock, as well as set up proper constraints on the trades required by the strategy.
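A liquidity screen and a trade constraint of this kind might look like the following minimal sketch. The thresholds (minimum market capitalization, minimum average daily volume, a 10% participation cap, a five-day trading window) and all names are hypothetical values chosen for illustration, not recommendations.

```python
from dataclasses import dataclass


@dataclass
class Stock:
    ticker: str
    market_cap: float  # total market capitalization, in dollars
    adv: float         # average daily dollar trading volume


def is_tradable(stock, min_market_cap=250e6, min_adv=1e6):
    """Screen out micro-cap and thinly traded stocks (thresholds are assumptions)."""
    return stock.market_cap >= min_market_cap and stock.adv >= min_adv


def capped_trade(desired_dollars, stock, max_participation=0.10, days=5):
    """Limit a trade to an assumed share of volume over the trading window."""
    limit = max_participation * stock.adv * days
    return min(desired_dollars, limit)


# The example from the text: $10 million chasing a $5 million company.
# The $50,000 daily volume is a made-up figure.
micro = Stock(ticker="MICRO", market_cap=5e6, adv=50e3)
print(is_tradable(micro))
print(capped_trade(10e6, micro))
```

In a simulation, a position that fails the screen would be dropped, and any remaining trade would be cut to the capped amount, so the backtest only earns returns on positions that could actually have been built.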
Investors who allocate to strategies, such as high dividend-growth, that typically require holding a relatively illiquid subset of the universe (Chow et al.) can apply a haircut to backtest results when setting their forward-looking return expectations. Illiquid stocks do offer more mispricing, and thus profit, opportunities, on average, because the price discovery process for these stocks is generally slower. Staying alert to the likelihood that the paper alphas of these strategies will be lower once they are live can shield investors from unpleasant surprises.
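One simple way to size such a haircut is to subtract an estimated trading-cost drag from the paper alpha. The sketch below assumes the drag equals twice the one-way turnover times a one-way trading cost, because each replaced position is both sold and bought. The formula, the function name, and the input values are illustrative assumptions, not estimates for any actual strategy.

```python
def net_alpha(paper_alpha, annual_turnover, one_way_cost):
    """Paper alpha less an assumed trading-cost drag.

    annual_turnover: one-way turnover, the fraction of the portfolio replaced per year.
    one_way_cost: assumed all-in cost per dollar traded (explicit plus market impact).
    """
    cost_drag = 2 * annual_turnover * one_way_cost
    return paper_alpha - cost_drag


# Hypothetical inputs: 2% paper alpha, 60% one-way turnover, 0.5% one-way cost.
live_estimate = net_alpha(paper_alpha=0.02, annual_turnover=0.60, one_way_cost=0.005)
print(f"estimated live alpha: {live_estimate:.2%}")
```

For an illiquid strategy the assumed one-way cost would be higher, which is exactly why its paper alpha deserves a larger haircut.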
Strategies with high turnover rates, or with turnover concentrated in a few stocks rather than spread across the entire portfolio, also tend to experience high implementation costs. If this product feature is necessary to deliver the outcome investors seek, and no product design changes can address it effectively, sophisticated implementers can use algorithms to tactically take advantage of available liquidity. A momentum strategy falls into this category. For this reason, incorporating momentum in a passive smart-beta index strategy is very challenging.
As the popularity of smart beta strategies grows, the dollar volume of trades in the underlying securities—all competing for liquidity on rebalancing dates—likewise grows. This leads to higher market impact costs. Whereas the explicit costs of trading are decreasing over time as technology improves, we expect the implicit market impact costs associated with trading to increase. To help smart beta investors assess the market impact costs related to different strategies, we offer cost estimates based on Aked and Moroz (2015) on the Smart Beta Interactive tool on our website.
Saturday Night Live is the longest-running live television show in the United States. Viewers who tune in on Saturday nights know it’s live and it won’t be perfectly scripted. Likewise, investors who choose smart beta shouldn’t expect the perfect alpha production promised by a simulated backtest. After all, backtests don’t produce a single dollar, euro, or pound of investor benefit.
To improve the chance that the live results of smart beta strategies will produce the benefits investors expect, we suggest investors do three things:
Expect lower returns than the backtest produced. Backtest results can be an overly optimistic estimate of investors’ experience going forward because of data-snooping risk and the omission of transaction costs.
Dig deeper. In order to achieve the superior investment outcomes promised by smart beta strategies, investors need to make decisions cautiously and request that asset managers provide out-of-sample test results as well as return estimates that incorporate implementation costs.
Use theory. Most importantly, we recommend that investors select strategies built on strong underlying economic theory and that have a simple, transparent, and intuitive methodology.