Smart beta index providers offer commercial products predicated on “quality” as an independent source of return that also serves to diversify multifactor portfolios. Hsu, Kalesnik, and Kose (HKK) recognize, however, that there is no generally accepted definition of quality, and that factor index providers use a variety of measures to distinguish between high- and low-quality companies. Accordingly, their award-winning article opens with a meticulous examination of the ways in which providers implicitly define quality in constructing indices. It proceeds to evaluate the risk of data mining—a danger the authors take seriously—and to test the robustness of the providers’ operative categories in a sustained effort to determine which, if any, of the popular quality attributes are reliable sources of long-term return. The authors find that some definitional categories deliver superior performance, while others do not appear to be true factors.
The company characteristics used in the construction of six major quality indices available in the marketplace include return on equity, dividend growth, change in asset turnover, and debt-to-cash flow ratio, among others. Overall, these diverse attributes fall into seven groups: profitability, earnings stability, capital structure, growth, accounting quality, payout/dilution, and investment. Capital structure, accounting quality, and investment can be viewed as evidence of managerial conservatism. The authors implement the payout/dilution indicator in two ways, as payout (dividends plus share repurchases) and as net payout (dividends and repurchases less equity issuance).
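The two payout definitions differ only in whether equity issuance is netted out, which a short sketch makes concrete. The function names, the dollar figures, and the market-cap scaling in `as_yield` are illustrative assumptions; the summary does not say how HKK normalize the signal.

```python
def payout(dividends: float, repurchases: float) -> float:
    """Gross payout: cash returned to shareholders."""
    return dividends + repurchases


def net_payout(dividends: float, repurchases: float, equity_issuance: float) -> float:
    """Net payout: gross payout less cash raised by issuing new equity."""
    return payout(dividends, repurchases) - equity_issuance


def as_yield(amount: float, market_cap: float) -> float:
    """Scale by market cap (an assumed normalization, not stated in the summary)."""
    if market_cap <= 0:
        raise ValueError("market_cap must be positive")
    return amount / market_cap


# Hypothetical company, figures in $ millions.
gross = payout(dividends=40.0, repurchases=60.0)  # 100.0
net = net_payout(dividends=40.0, repurchases=60.0, equity_issuance=130.0)  # -30.0
```

In this made-up case the company looks generous on gross payout yet is a net issuer of equity, which is the distinction the net payout variant is meant to capture.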
HKK employ a three-step methodology to appraise the validity of the signals used within the foregoing categories to assign company-specific quality scores. A variable is more likely to stand for a unique, homogeneous source of return or a single behavioral anomaly if 1) it has been extensively explored in peer-reviewed publications; 2) its statistical significance holds up across time periods and geographical regions; and 3) its statistical significance remains robust despite reasonable perturbations in its operational definition, for example, using return on assets rather than return on equity or return on invested capital as the measure of profitability. This three-step procedure offers a suite of qualitative and quantitative diagnostics to help investors mitigate potential data-mining bias and objectively evaluate particular factor strategies.
A thorough review of the literature on a purported source of excess return ensures that qualified economists have independently examined its merits and reduces the likelihood that the published results reflect methodological errors. The authors report, for instance, that, as of the time they completed their review, at least seven top-tier journals had published academic articles on the profitability characteristic. Their survey of the factor literature reveals that profitability, investment (i.e., asset growth), accounting quality, and payout/dilution are strongly related to future returns. Empirical findings on measures of capital structure are mixed, but book leverage appears to be too closely related to the low-beta anomaly to merit consideration in its own right. Similarly, earnings stability might rightfully be seen as a type of low-beta characteristic rather than a distinct factor. Historical earnings growth has found neither theoretical nor empirical support in mainstream financial publications.
Most of the published research uses long time series of US data. To test the robustness of quality returns across geographies, HKK examine factor performance in five regions: the US, global developed markets, Japan, Europe, and Asia Pacific excluding Japan. The robustness tests presented in their study also include definitional perturbations. Because construction methodologies that do not demonstrate strong in-sample performance are rarely published and never implemented, there is a natural upward drift in factor t-statistics. The third step, examining the statistical impact of modest methodological changes, brings this bias to light: a disproportionate change in factor performance might be a sign of data mining. HKK describe three to eight definitions they employed in implementing each of the quality factor categories. Verifying performance in multiple regions and perturbing factor definitions reduce the within-category selection bias. The authors note, however, that these steps do not address bias at the category selection level.
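The perturbation check can be sketched as computing the same t-statistic for each alternative definition within a category and seeing how many clear a hurdle. This is a minimal illustration, not HKK's procedure: the definition names, the return series, and the hurdle of 2.0 are all hypothetical.

```python
import math
import statistics


def t_stat(returns):
    """t-statistic of the mean return against zero."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    std_err = statistics.stdev(returns) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("returns have no variation")
    return statistics.mean(returns) / std_err


def perturbation_report(spreads_by_definition, hurdle=2.0):
    """Return (t-stat per definition, share of definitions clearing the hurdle)."""
    t_stats = {name: t_stat(r) for name, r in spreads_by_definition.items()}
    share = sum(t > hurdle for t in t_stats.values()) / len(t_stats)
    return t_stats, share


# Hypothetical high-minus-low return spreads for three profitability definitions.
spreads = {
    "return_on_equity": [0.02, 0.01, 0.03, 0.00, 0.02, 0.01],
    "return_on_assets": [0.01, 0.02, 0.02, 0.00, 0.03, 0.01],
    "hypothetical_variant": [0.03, -0.02, 0.01, -0.01, 0.02, -0.02],
}
t_stats, share_significant = perturbation_report(spreads)
```

A category whose significance survives only under one definition, while close variants fall well short, is the pattern the third step treats as a possible sign of data mining.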
Using data from CRSP, Compustat, Datastream, and Worldscope, the authors constructed portfolios by region, factor definition, industry, size (large and small), and quality factor strength (high and low). They evaluated three performance measures: average portfolio return difference; average four-factor model alphas, using the classic Fama–French factors plus momentum; and the Sharpe ratio, supplemented with significance test statistics. In order to mitigate upward bias at the category level, they applied statistically appropriate haircuts to the Sharpe ratios.
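Two of the three performance measures reduce to simple arithmetic on monthly return series, sketched below under stated assumptions. The return figures are invented, and annualizing by the square root of 12 is a common convention that the summary does not attribute to HKK. The four-factor alpha requires a regression on the factor returns and is left out of this sketch, as are the Sharpe ratio haircuts.

```python
import math
import statistics


def return_difference(high_quality, low_quality):
    """Period-by-period return of the high-quality minus the low-quality portfolio."""
    if len(high_quality) != len(low_quality):
        raise ValueError("series must cover the same periods")
    return [h - l for h, l in zip(high_quality, low_quality)]


def annualized_sharpe(monthly_excess_returns, periods_per_year=12):
    """Mean over sample standard deviation, scaled to an annual horizon."""
    mean = statistics.mean(monthly_excess_returns)
    stdev = statistics.stdev(monthly_excess_returns)
    if stdev == 0:
        raise ValueError("returns have no variation")
    return mean / stdev * math.sqrt(periods_per_year)


# Hypothetical monthly returns for the two sides of one quality sort.
high = [0.03, 0.01, 0.02, -0.01]
low = [0.01, 0.00, 0.02, -0.02]

spread = return_difference(high, low)
average_spread = statistics.mean(spread)
# A long-short spread is self-financing, so it is treated as an excess return here.
sharpe = annualized_sharpe(spread)
```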
HKK report the empirical results of their robustness tests in fine detail. In summary, they find that profitability, accounting quality, payout/dilution, and investment offer robust benefits on a risk-adjusted or multifactor basis. (Investment has a weaker multifactor alpha because it is correlated with the value factor.) Earnings stability, capital structure, and growth in profitability do not have empirical support as beneficial factors. These quantitative findings are consonant with the qualitative results of the literature review.
The authors’ tests do not conclude with the three-step procedure. They also apply a methodology proposed by Harvey, Liu, and Zhu, using higher t-statistic significance hurdles based on the estimated total number of backtests conducted by all researchers. They find that the four quality factor categories previously identified as robust fall well above the adjusted benchmarks, suggesting they are unlikely to have been discovered merely because of a multiple search bias. In addition, following a procedure set forth by McLean and Pontiff, they report the average returns and t-statistics of sample definitions of the four robust quality factors before and after publication in the finance literature. The authors determine that return on equity (a profitability measure), accruals (representing accounting quality), and net issuance (a payout/dilution indicator) have similar average monthly returns in the pre- and post-publication periods. Asset growth, an investment measure, has lower monthly returns in the post-publication timeframe. The authors note, however, that this factor’s publication date is 2008, resulting in a relatively short post-publication measurement period.
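Both checks in this paragraph can be illustrated in a few lines. The sketch below uses a Bonferroni correction with a normal approximation as a simple stand-in for a multiple-testing hurdle; it is an assumption, not necessarily the adjustment HKK applied. The pre- and post-publication split uses invented yearly returns, and assigning the publication year itself to the post-publication sample is also an assumption.

```python
import math
import statistics
from statistics import NormalDist


def bonferroni_t_hurdle(num_tests, alpha=0.05):
    """Two-sided critical value when alpha is split evenly across num_tests trials."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    # Normal approximation to the t-distribution, reasonable for long samples.
    return NormalDist().inv_cdf(1.0 - alpha / (2.0 * num_tests))


def mean_and_t(returns):
    """Average return and its t-statistic against zero."""
    mean = statistics.mean(returns)
    std_err = statistics.stdev(returns) / math.sqrt(len(returns))
    return mean, mean / std_err


def pre_post_publication(dated_returns, publication_year):
    """Split (year, return) pairs at the publication year and summarize each side."""
    pre = [r for year, r in dated_returns if year < publication_year]
    post = [r for year, r in dated_returns if year >= publication_year]
    return mean_and_t(pre), mean_and_t(post)


single_test_hurdle = bonferroni_t_hurdle(1)    # about 1.96
many_tests_hurdle = bonferroni_t_hurdle(300)   # hypothetical count of backtests

# Hypothetical factor returns around a 2008 publication date.
history = [(2005, 0.02), (2006, 0.01), (2007, 0.03),
           (2008, 0.01), (2009, 0.00), (2010, 0.02)]
(pre_mean, pre_t), (post_mean, post_t) = pre_post_publication(history, 2008)
```

The hurdle rises as the assumed number of backtests grows, which is why a factor that clears it by a wide margin is harder to dismiss as a product of search. A short post-publication sample, as with asset growth, leaves the second comparison with little statistical power.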
Finally, HKK establish that the most parsimonious quality definition is a combination of the profitability and investment signals. Adding accounting quality and payout/dilution also produces a definition that has historically provided an equity return premium. The authors observe, in closing, that the definitions they find to be robust are uniformly related to corporate governance, suggesting that quality portfolios constructed in accordance with these factor definitions might be of interest to investors who favor environmental, social, and governance (ESG) strategies.
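One way to picture a combined profitability and investment definition is an equal-weighted average of cross-sectional z-scores, with asset growth entering negatively so that conservative investment raises the score. The summary does not say how HKK combine the signals, so the weighting, the z-scoring, and the company figures below are all assumptions.

```python
import statistics


def z_scores(values):
    """Standardize a cross-section of values to mean 0 and unit dispersion."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        raise ValueError("values have no cross-sectional dispersion")
    return [(v - mean) / stdev for v in values]


def composite_quality(profitability, asset_growth):
    """Equal-weighted blend: high profitability and low asset growth score well."""
    if len(profitability) != len(asset_growth):
        raise ValueError("inputs must describe the same companies")
    prof_z = z_scores(profitability)
    inv_z = [-z for z in z_scores(asset_growth)]  # lower growth ranks higher
    return [(p + i) / 2.0 for p, i in zip(prof_z, inv_z)]


# Three hypothetical companies: return on equity and year-over-year asset growth.
scores = composite_quality(
    profitability=[0.20, 0.10, 0.05],
    asset_growth=[0.05, 0.10, 0.30],
)
```

Here the first company, which is the most profitable and grows its assets the least, receives the highest composite score, and the third receives the lowest.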
Summarized by Philip Lawton, PhD, CFA