- We find slippage between the factor returns realized by mutual fund managers and the theoretical factor returns “earned” by long–short paper portfolios over the period 1991–2016.
- The source of the slippage appears to be costs related to implementation, such as trading costs, missed trades, expenses of shorting, manager fees, stale prices, bid–ask spreads, and so forth.
- Our research shows that over the last quarter-century the real-world returns for the value and market factors were half, or less than half, of what theoretical factor returns imply, and the momentum factor provided no benefit whatsoever to the end-investor.
- Our core findings of a return shortfall in real-world factor investing are supported by a series of six robustness checks.
“Why, sometimes I’ve believed as many as six impossible things before breakfast.”
—The White Queen, from Lewis Carroll’s Through the Looking Glass
This article is the first in a series we will publish in 2017 demonstrating that factor tilts generally deliver far less alpha in live portfolios than they do on paper. Put another way, investment managers generally fail to capture the returns that would be expected based on their factor tilts. We break our research into four parts. In this first article, we show that the factor returns realized by fund managers differ starkly from the theoretical factor returns constructed from long–short paper portfolios. Notably, the market, value, and momentum factors are far less rewarding in live fund management than their theoretical long–short paper portfolio returns suggest.
In our next article, we will challenge the idea that factor tilts (portfolios combining several theoretical factor portfolios) are the same as smart beta strategies. Using Fundamental Index™, equal-weight, and low-volatility strategies as illustrative examples, we show that factor tilts cannot successfully replicate smart beta strategies. Although the factor tilts of these strategies are easy to replicate, the resulting portfolios look very different from the originals: the replication portfolios have far higher turnover, lower performance, and smaller capacity.
In a third article, we will show that the relative valuations of factor loadings can give us the courage to buy mutual funds when factor tilts are at their cheapest and hence most out of favor. Along with fees, turnover, and past performance, factor loadings can help us improve our forecasts of fund returns; low fees, low turnover, and low (yes, low!) past performance are each predictive of better future returns. We find the best predictor is prior three-year performance, but with the wrong sign: buying the losers is the winningest strategy.
Finally, a fourth article will take a closer look at momentum, for which we find the realized alpha in live portfolios is essentially zero compared to a theoretical alpha of around 6% a year. We show why momentum doesn’t work in live portfolios, and also show how momentum can be saved as a useful source of alpha.
In 2016, we published a series of articles that challenged the “smart beta” revolution by pointing out that performance chasing in factor tilts and in smart beta strategies can be as damaging as performance chasing in other realms of asset management.1 Relative valuations are negatively correlated with subsequent returns in factors and smart beta strategies, in exactly the same way we observe a value effect in stock selection and in asset allocation.
To many readers, the two most surprising revelations in our 2016 series were 1) that many factors owe much, or all, of their historical return to revaluation alpha, meaning that if the strategy has become far more expensive than in the past, its historical efficacy is exaggerated and its future efficacy may evaporate entirely; and 2) that many popular factor tilts and smart beta strategies were expensive relative to their historical norms.2 We found that the value and small-cap strategies were trading cheap relative to history, and that the momentum, gross profitability (quality), and low beta strategies were trading expensive relative to history, implying that the past returns for the former factors were understated (true efficacy was greater than it seemed) and for the latter were overstated (less powerful than they seemed).
Consequently, our findings implied that future returns for the value and small-cap factors were likely to be strong, and those for momentum, quality, and low beta were likely to be weak. This finding of weak expected performance played out in live performance far faster and far more powerfully than we could have anticipated.3 The spread between the strategies we identified as cheapest and those we identified as most expensive was well over 1,000 basis points (bps) in the second half of 2016.
In this article, we attempt to measure the slippage between the theoretical factor returns, derived from long–short paper portfolios, and the realized factor returns actually captured by mutual fund managers. We conduct the analysis using both US equity funds and international equity funds. Our primary focus is on US funds, for which we show extensive robustness tests to quantify the impact, if any, of changes in estimation methodology or inputs on our results. We find that managers who favor high factor loadings for market beta, value, or momentum generally do not derive nearly as much incremental return relative to low beta, growth, and contrarian funds, respectively, as the factor return histories would suggest. In fact, well over half of the factor return for market beta and for value (HML) disappears, as does essentially all of the momentum factor return. We also explore the potential reasons for these impressive performance shortfalls.
Factor Returns: The Theory
Factors are used to measure manager style, to disentangle style-based performance from skill-based performance, and to build and sell quantitative investment strategies. In addition to the capital asset pricing model, or CAPM, market factor, the value, size, and momentum factors are some of the more popular factors known to academics and practitioners since at least the early 1990s. Using the most common theoretical portfolio definitions, these four factors have shown quite impressive performance: the market, value, size, and momentum factors have delivered returns of 8.2%, 2.6%, 3.6%, and 5.7% a year, respectively, over the last 26 years! The low beta factor (also known as the betting-against-beta, or BAB, factor), discovered in the 1970s, did not garner much popularity until recently, when it delivered an eye-catching 26-year return of 10.3%.4 Other factors that have become popular over the last decade (profitability, investment, and illiquidity) also showed fabulous historical returns of 3.9%, 3.2%, and 2.1%, respectively, over the past quarter-century.
Such formidable numbers might suggest that factor tilts are a ready path to higher returns, and might even indicate which factors are more likely to deliver outperformance going forward. This is the theory widely advanced as fact by a vocal quant community. It is also a product of data mining and selection bias. While theories can help advance our understanding of a subject, they are just idealized approximations of the real world built on a foundation of core (and often wrong) simplifying assumptions. No theory can fully capture how the real world works. Worse, the real world frequently presents us with objective facts and outcomes that contradict theoretical predictions.
Factor Returns: Theory Meets Practice
What if some factor returns earned by fund managers are far smaller than the historical theoretical factor returns imply, resulting in a return shortfall in investors’ real-world portfolios? In this case, the outputs of portfolio attributions based on theoretical portfolios will be inadequate and often misleading, and the investment process that takes theoretical factor performance for granted will favor factor tilts that fail to deliver in the real world. Ultimately, the knowledge that the returns achievable in practice differ starkly from the theoretical returns should urge investors to reconsider their factor allocation choices.
In practice, the long–short portfolios used to construct factor-return time series are not investable. The return histories for these paper portfolios ignore a startling array of costs associated with real-world implementation: trading costs, missed trades, illiquid stocks, commissions, management fees, borrowing costs for the short portfolio, and the use of stocks unavailable for shorting. To this list of return shortfall sources, we might add data mining and survivorship bias. When factor histories are cherry-picked, factors can rise to the top of the popularity roster even though they were selected long after, and because of, the large returns they once earned.
We can measure, albeit with some imprecision, the return slippage or return shortfall. Factor attribution assumes that the factor return flows straight through to fund returns. Our goal is to find out, month by month, how much return a factor loading delivers to mutual fund results. We can “reverse engineer” factor returns from mutual fund returns using a two-stage regression procedure. The purpose of the first-stage regression is to help identify manager factor exposure (e.g., which fund is value and which fund is growth). Once we have the estimated factor exposure for all funds, the purpose of the second stage is to measure the performance difference between funds that is attributable to their different factor loadings (e.g., between value managers and growth managers) for each unit of factor exposure.5
An example will help make our method easier to understand. For simplicity, suppose we have return data for two mutual funds (Fund A and Fund B) over a 12-month period. We first estimate the value factor loadings for each fund using the full 12-month sample of return data and conclude that Fund A is a value fund with a value beta of 0.6 and Fund B is a growth fund with a value beta of −0.3. Next, we calculate the monthly relative return of Fund A versus Fund B for each of the 12 months. Dividing each of the 12 monthly relative-return observations by the 0.9 value beta difference between the two funds, we can infer the monthly return earned per unit of value factor exposure.
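The two-fund arithmetic above can be sketched in a few lines of Python. This is only an illustration: the fund returns are randomly generated placeholders rather than data from the study, the function name `implied_factor_returns` is hypothetical, and the betas of 0.6 and −0.3 are taken as given instead of being estimated.

```python
import numpy as np

def implied_factor_returns(ret_a, ret_b, beta_a, beta_b):
    """Monthly factor return implied by the relative return of two funds.

    Divides the relative return (A minus B) by the difference in the
    funds' factor loadings, as in the two-fund example in the text.
    """
    ret_a = np.asarray(ret_a, dtype=float)
    ret_b = np.asarray(ret_b, dtype=float)
    beta_gap = beta_a - beta_b
    if beta_gap == 0:
        raise ValueError("the two funds must have different factor loadings")
    return (ret_a - ret_b) / beta_gap

# Hypothetical inputs (assumptions for illustration, not study data).
rng = np.random.default_rng(0)
common = rng.normal(0.008, 0.04, size=12)  # return component shared by both funds
value = rng.normal(0.003, 0.02, size=12)   # return per unit of value exposure

fund_a = common + 0.6 * value   # value fund, value beta of 0.6
fund_b = common - 0.3 * value   # growth fund, value beta of -0.3

implied = implied_factor_returns(fund_a, fund_b, beta_a=0.6, beta_b=-0.3)
print(np.round(implied, 4))
```

Because these toy funds carry no idiosyncratic noise, the implied series reproduces the `value` series. With real funds the two-fund estimate would be far noisier.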
For any two funds, the performance difference will be due to many contributing factors, not the least of which is idiosyncratic risk. Consequently, a performance difference will be a poor measure of the value factor return. But as the universe expands to include hundreds, and then thousands, of funds, we should be able to infer with some confidence the monthly returns attributable to each unit of value factor exposure.
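Scaling the idea up to many funds can be sketched as a generic two-stage regression. The sketch below is a simplified single-factor version written under our own assumptions (simulated returns, full-sample betas, ordinary least squares in both stages, hypothetical function names). It is not the exact specification used in the study, whose details are not reproduced here.

```python
import numpy as np

def first_stage_betas(fund_returns, factor):
    """Stage 1: time-series OLS of each fund's returns on the paper factor.

    fund_returns has shape (T months, N funds); factor has shape (T,).
    Returns the N estimated factor loadings (regression slopes).
    """
    X = np.column_stack([np.ones_like(factor), factor])
    coefs, *_ = np.linalg.lstsq(X, fund_returns, rcond=None)
    return coefs[1]

def second_stage_factor_returns(fund_returns, betas):
    """Stage 2: for each month, cross-sectional OLS of fund returns on betas.

    The slope for a month is the return earned per unit of factor
    exposure in that month. Returns an array of length T.
    """
    Z = np.column_stack([np.ones_like(betas), betas])
    coefs, *_ = np.linalg.lstsq(Z, fund_returns.T, rcond=None)
    return coefs[1]

# Simulated data (assumptions for illustration only).
rng = np.random.default_rng(1)
T, N = 120, 500
paper = rng.normal(0.004, 0.03, size=T)        # paper-portfolio factor return
slippage = 0.002                               # assumed monthly implementation drag
true_betas = rng.uniform(-0.5, 0.8, size=N)    # growth funds to value funds
noise = rng.normal(0.0, 0.02, size=(T, N))     # idiosyncratic fund returns
funds = np.outer(paper - slippage, true_betas) + noise

betas = first_stage_betas(funds, paper)
realized = second_stage_factor_returns(funds, betas)

print("mean paper factor return:   ", paper.mean())
print("mean realized factor return:", realized.mean())
```

With hundreds of simulated funds the idiosyncratic noise largely averages out in the cross-section, which is the intuition behind expanding the universe beyond two funds.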
In a perfect world, the monthly factor returns derived from fund factor loadings, or the reverse-engineered factor returns, should very closely match the returns from the theoretical long–short portfolios used to create factors and factor-return time series. In fact, the returns derived from these two very different factor-return time series—one based on a long–short paper portfolio, and the other based on live fund returns—exhibit extremely high correlation (averaging over 90%). Month to month they track very closely. The mean returns, however, are shockingly different. Factor returns captured by mutual fund managers, especially for the factors with the largest historical long–short returns, tend to be starkly lower than their theoretical paper portfolio counterparts.
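High correlation and a large gap in mean returns can coexist because correlation ignores any constant drag on returns. A minimal sketch of the comparison follows, using simulated series, a hypothetical function name, and simple arithmetic annualization (monthly mean times 12), all of which are our assumptions.

```python
import numpy as np

def compare_factor_series(paper, realized, periods_per_year=12):
    """Return (correlation, annualized mean shortfall) for two return series.

    The shortfall is the paper mean minus the realized mean, scaled to a
    year by simple multiplication (an assumption; no compounding).
    """
    paper = np.asarray(paper, dtype=float)
    realized = np.asarray(realized, dtype=float)
    corr = np.corrcoef(paper, realized)[0, 1]
    shortfall = (paper.mean() - realized.mean()) * periods_per_year
    return corr, shortfall

# Simulated example: the realized series equals the paper series minus a
# constant monthly drag of 0.4% (a hypothetical number, not from the study).
rng = np.random.default_rng(2)
paper = rng.normal(0.005, 0.03, size=312)   # 26 years of monthly returns
realized = paper - 0.004

corr, shortfall = compare_factor_series(paper, realized)
print(f"correlation: {corr:.3f}, annualized shortfall: {shortfall:.3%}")
```

In this stylized case the two series move in lockstep month to month, yet the realized series trails the paper series by the full amount of the drag every year.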
Our analysis relies on data from the Morningstar Direct Mutual Fund Database for the period January 1990–December 2016. The dataset reports historical monthly total returns for all mutual funds, including ones that were liquidated or merged, ensuring our mutual fund dataset is largely survivorship-bias free. The initial fund sample includes US open-end long-only active equity funds with at least two years of return history as of December 2016. We then limit the funds in our sample to A-share, no-load, and institutional share classes.6
Our final US fund sample consists of 5,323 funds, a mixture of live funds and funds that no longer exist today. Figure 1 illustrates the evolution of the fund sample over time. Our sample size, the blue line, begins with 658 funds in 1990⁷ (just over 392 unique funds, not counting the different share classes) and gradually increases to a peak of 3,800 funds in 2008, before falling to about 3,000 funds in 2016.
The green line tracks the percentage of funds with reported returns, but without reported expense ratios. Information on fund expense ratios is not available for many funds, especially in the early part of the sample. Our main analyses use net-of-expense fund returns, which is how Morningstar Direct reports the data. For the subset of funds for which we do have expense data, we also conduct a robustness test showing results based on gross-of-expense fund returns.