Many investors expect the alpha of a strategy to be its historical alpha, so much so that this assumption itself is an example of an alpha-forecasting model. One of the cornerstones of any investment process is an estimate of forward-looking return. We argue that a good alpha-forecasting model, whether for a strategy or a factor tilt, should have three key attributes:
1. Forecasts should correlate with subsequent alphas.
2. Forecasts should be paired with a measure of the likely accuracy of the forecast. A standard statistical way to measure the accuracy of a forecast is mean squared error, a measure of how reality has differed from past forecasts.
3. Forecasts should provide realistic estimates of expected returns.
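The first two criteria can be scored directly. As a minimal sketch (the function name and the numbers below are illustrative, not from the article), the correlation and mean squared error of a set of forecasts might be computed as:

```python
import numpy as np

def evaluate_forecasts(forecasts, realized):
    """Score a forecasting model on the first two criteria:
    correlation with subsequent alphas, and mean squared error
    (how reality has differed from past forecasts)."""
    forecasts = np.asarray(forecasts, dtype=float)
    realized = np.asarray(realized, dtype=float)
    mse = np.mean((realized - forecasts) ** 2)
    corr = np.corrcoef(forecasts, realized)[0, 1]
    return corr, mse

# Illustrative numbers only:
corr, mse = evaluate_forecasts([0.02, -0.01, 0.03], [0.01, 0.00, 0.02])
```

A good model scores high on the correlation and low on the MSE; the comparisons later in the article rank models on exactly these two statistics.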
These criteria provide useful metrics for comparing different alpha-forecasting models. We select seven models for comparison. One model assumes an efficient market: no factors or strategies have any alpha. Two of the models use only past performance and ignore valuations, and four are based on valuation levels relative to historical norms.
Model 0. Zero factor alpha. In an early version of the efficient market hypothesis—the capital asset pricing model, or CAPM—researchers argued that an asset’s return was solely determined by its exposure to the market risk factor. Similarly, Model 0 assumes the risk-adjusted alpha of a factor tilt or smart beta strategy is approximately zero. We measure the mean squared error relative to an expected alpha of zero.
Model 1. Recent past return (most recent five years). This model uses the most recent five-year performance of a factor or strategy to forecast its future return. Because our research tells us that investors who select strategies based on wonderful past performance are likely buying stocks with high valuations, we expect this model will favor the strategies that are currently expensive and have low future expected returns.
Model 2. Long-term historical past return (inception to date). Long-term historical factor returns are perhaps the most widely accepted way to estimate factor premiums (expected returns), both in the literature and in the practitioner community. Doing so requires that we extrapolate historical alpha to make the forecast: what has worked in the past is deemed likely to work in the future. Averaging performance over a very long period of time should theoretically mitigate vulnerability to end-point richness.2 By using multiple decades of history (versus a short five-year span as Model 1 does), we would expect this model to perform relatively well in differentiating well-performing factors from less-well-performing ones.
Model 3. Valuation dependent (overfit to data). This model is a simple and intuitive valuation-dependent model, as illustrated by the log-linear line of best fit in Figure 1.3 At each point in time, we calibrate the model only to the historically observed data available at that time; no look-ahead information enters the model calibration. This model encourages us to buy what has become cheap (has performed badly in the past), rather than chase what has become newly expensive (has performed exceptionally well).
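A minimal sketch of such a valuation-dependent calibration, assuming the model regresses subsequent returns on the log of relative valuation (the article's exact specification may differ, and the function names here are hypothetical):

```python
import numpy as np

def calibrate_valuation_model(log_val, subsequent_ret):
    """Fit the log-linear model: subsequent return = a + b * log(relative valuation).
    Pass only data observed up to the forecast date, so that no
    look-ahead information enters the calibration."""
    b, a = np.polyfit(log_val, subsequent_ret, 1)  # slope first, then intercept
    return a, b

def forecast(a, b, current_log_val):
    # With b < 0, a cheap factor (low log valuation) receives a
    # higher forecast return, and a newly expensive one a lower forecast.
    return a + b * current_log_val
```

Re-running the calibration at each forecast date on the expanding history is what keeps the model honest about what was knowable at the time.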
Model 4. Valuation dependent (shrunk parameters). A model calibrated using past results may be overfitted, and as a result provide exaggerated forecasts that are either too good or too bad to be true. Parameter shrinkage is a common way to reduce model overfitting to rein in extreme forecasts. (Appendix A provides more information on how we modify the parameters estimated in Model 4 to less extreme values.)
Model 5. Valuation dependent (shrunk parameters with variance reduction). Model 5 further shrinks Model 4 by dividing its output by two. The output of this model is perfectly correlated with the output of Model 4, but with exactly half the variability.
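The relationship between Models 4 and 5 can be verified directly; the forecast values below are hypothetical placeholders:

```python
import numpy as np

model4 = np.array([0.06, -0.04, 0.08, 0.01])  # hypothetical Model 4 forecasts
model5 = model4 / 2.0                          # Model 5: divide the output by two

corr = np.corrcoef(model4, model5)[0, 1]  # perfectly correlated with Model 4
ratio = np.std(model4) / np.std(model5)   # variability is exactly halved
```

Halving the output cannot change the forecasts' correlation with subsequent returns; it only reins in the magnitude of the forecasts, which matters for the MSE criterion.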
Model 6. Linear model (look-ahead calibration). Model 6 allows look-ahead bias: we estimate our log-linear valuation model using the full sample. Of course, this model delivers past “forecasts” that are implausibly good because no one has clairvoyant powers! Nevertheless, it provides a useful benchmark—a model that, by definition, has a perfect fit to the data—against which we can compare the other models. How close can we come to this impossible ideal?
For our model comparison we use the same eight factors in the US market as we use in our previously published research. (The description of our factor construction methodology is available in Appendix B.) We use the first 24 years of data (Jan 1967–Dec 1990) in the initial model calibration, encompassing several valuation cycles, and use the remaining data (Jan 1991–Oct 2011) to run the model comparison. These data end in 2011 because we are forecasting subsequent five-year performance; an end date in October allowed us to conduct our model comparison analysis in November and December. We report the comparison results in Table 1. Model 0 and Model 2 are our base cases. We need to beat a static zero-alpha assumption (Model 0) in order to even argue for the use of dynamic models in alpha forecasting. And we need to beat Model 2 to demonstrate the usefulness of a valuation-based forecasting model.
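The evaluation design above can be sketched as an expanding-window loop. Everything here is a simplified illustration, not the article's actual implementation; the two baseline models are shown in stylized form:

```python
import numpy as np

def expanding_window_mse(series, model_fn, start):
    """At each step t >= start, calibrate model_fn on series[:t] only
    (no look-ahead) and forecast series[t]; return the MSE of the
    forecasts. `start` plays the role of the initial calibration
    window (24 years in the article)."""
    errors = [(series[t] - model_fn(series[:t])) ** 2
              for t in range(start, len(series))]
    return float(np.mean(errors))

# The two base cases, in stylized form:
model0 = lambda hist: 0.0                    # Model 0: zero factor alpha
model2 = lambda hist: float(np.mean(hist))   # Model 2: inception-to-date mean
```

Running each candidate model through the same loop, on the same out-of-sample span, is what makes the MSE and correlation figures in Table 1 directly comparable.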
Model 1, which assumes that future alpha is best estimated by the past five years of performance, provides the least accurate forecast of alpha: based on mean squared error (MSE), it performs the worst of all seven models. Further compounding its poor predictive ability, its forecasts are negatively correlated with subsequent factor performance. Focusing on recent performance—the way many investors choose their strategies and managers—is not only inadequate, it leads us in the wrong direction.
Model 2, which uses a much longer period of past performance to forecast future performance, provides a significant improvement in accuracy over Model 1, as reflected by a much smaller MSE. Still, as with Model 1, its forecasts are negatively correlated with subsequent performance, and its forecast accuracy is worse than the zero-factor-alpha Model 0.
The key takeaway in the comparison of Models 1 and 2 is that a very long history of returns, covering at least several decades, may provide a more accurate forecast of a factor’s or smart beta strategy’s return than a short-term history, but the forecast is still essentially useless. Selecting strategies or factors based on past performance, regardless of the length of the sample, will not help investors earn a superior return and is actually more likely to hurt them. The negative correlations of the forecasts of both Models 1 and 2 with subsequent factor returns imply that factors with great past performance are likely overpriced and are likely to perform poorly in the future.4