Data-Driven Models
Empirical forecasting models · Sources
Primary papers, model variants, source notes, and review signals behind the Gradient boosting page.
Reference material used for orientation; read primary and academic sources first when claims conflict.
[S1] Friedman (2001). Greedy function approximation: a gradient boosting machine. Introduced functional gradient descent, pseudo-residuals, and shrinkage.
[S2] Friedman (2002). Stochastic gradient boosting. Added row subsampling at each round as additional regularization.
[S3] Chen and Guestrin (2016). XGBoost: a scalable tree boosting system. Second-order Taylor expansion, regularized objective, sparsity-aware splits, system-level optimizations.
[S4] Ke, Meng, Finley, Wang, Chen, Ma, Ye, Liu (2017). LightGBM: a highly efficient gradient boosting decision tree. Gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB).
[S5] Prokhorenkova, Gusev, Vorobev, Dorogush, Gulin (2018). CatBoost: unbiased boosting with categorical features. Ordered boosting to prevent target leakage.
[S6] Gu, Kelly, Xiu (2020). Empirical asset pricing via machine learning. Large-scale comparison finding gradient-boosted trees competitive with neural networks.
[S7] Medeiros, Vasconcelos, Veiga, Zilberman (2021). Forecasting inflation in a data-rich environment: the benefits of machine learning methods. Tree-based methods, random forests in particular, outperforming benchmark and penalized linear models for U.S. inflation.