Data-Driven Models
Empirical forecasting models · Sources
Primary papers, model variants, source notes, and review signals behind the LSTM page.
Reference material used for orientation; read primary and academic sources first when claims conflict.
[S1] Reference
Hochreiter and Schmidhuber (1997) -- Long short-term memory. Introduced the gated memory-cell architecture, designed to mitigate the vanishing gradient problem in recurrent networks.
[S2] Reference
Gers, Schmidhuber, and Cummins (2000) -- Learning to forget: continual prediction with LSTM. Added the forget gate, which was not in the original 1997 architecture.
[S3] Reference
Graves (2013) -- Generating sequences with recurrent neural networks. Stacked LSTM layers into deep networks and demonstrated sequence generation on text and online handwriting.
[S4] Reference
Sutskever, Vinyals, and Le (2014) -- Sequence to sequence learning with neural networks. Established the encoder-decoder LSTM architecture for sequence-to-sequence tasks.
[S5] Reference
Gal and Ghahramani (2016) -- A theoretically grounded application of dropout in recurrent neural networks. Introduced variational dropout for LSTMs with the same mask at each time step.
[S6] Reference
Salinas, Flunkert, Gasthaus, and Januschowski (2020) -- DeepAR: probabilistic forecasting with autoregressive recurrent networks. LSTM-based probabilistic time-series forecasting at scale.
[S7] Reference
Makridakis, Spiliotis, and Assimakopoulos (2018) -- Statistical and machine learning forecasting methods: concerns and ways forward. Compared machine learning methods, including LSTM, with statistical methods on 1,045 monthly series from the M3 competition; the statistical methods were generally more accurate.