We study which firm characteristics drive the economic value of machine learning portfolios. Three results stand out. First, in-sample variable importance overfits and provides little reliable guidance, highlighting the need for out-of-sample evaluation using economic criteria. Second, conventional models are dominated by microcaps, which inflate returns and concentrate gains in costly-to-trade stocks; excluding microcaps is essential for meaningful inference. Third, some predictors carry negative importance and consistently degrade performance; removing them improves risk-adjusted returns and clarifies which characteristics matter. These findings show that only with economic restrictions can machine learning deliver robust asset pricing insights.
Rethinking Variable Importance in Machine Learning




