Macro by Mark
© 2026 Mark Jayson Nation


ARIMA
Model

Autoregressive integrated moving average -- the workhorse univariate forecaster for stationary or differenced macro series.

How do you build a defensible univariate baseline forecast when the only information you trust is the series itself?

Background

The ARIMA framework was formalized by George Box and Gwilym Jenkins in their 1970 monograph Time Series Analysis: Forecasting and Control, published by Holden-Day. It gave a principled answer to a basic problem: how to take a single observed time series, remove non-stationarity through differencing, and then fit a parsimonious linear filter that captures the remaining autocorrelation structure. The Box-Jenkins methodology introduced a disciplined three-stage cycle of identification, estimation, and diagnostic checking that became the standard workflow for applied time-series analysis for the next three decades.

The statistical mechanism is a marriage of two linear filters. The autoregressive component regresses the differenced series on its own lagged values, capturing persistence and mean reversion. The moving-average component models the influence of past forecast errors, capturing the rate at which shocks dissipate. Differencing handles stochastic trends by reducing an integrated process to a stationary one. Together, the three pieces -- autoregressive order p, integration order d, and moving-average order q -- define the ARIMA(p, d, q) model. The Wold decomposition theorem guarantees that any covariance-stationary process admits an infinite-order MA representation, so ARIMA is not an arbitrary parametric choice but a finite-order approximation to the most general linear structure a stationary series can have.
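In lag-operator notation (the same symbols defined in the Components section), the ARIMA(p, d, q) structure just described can be written compactly as:

```latex
\phi(L)\,(1 - L)^d y_t = c + \theta(L)\,\varepsilon_t,
\qquad
\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p,
\qquad
\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q,
```

with \varepsilon_t white noise of variance \sigma^2. Setting d = 0 recovers a plain ARMA(p, q); the AR polynomial carries persistence, the MA polynomial carries the dissipation of past shocks.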

ARIMA remains the workhorse univariate baseline in central bank forecasting divisions, statistical agencies, and applied macro research. The Federal Reserve Bank of St. Louis publishes automated ARIMA baselines for hundreds of FRED series. The European Central Bank and Bank of England use ARIMA benchmarks as the reference forecast against which richer multivariate and structural models are evaluated. The IMF's World Economic Outlook process includes ARIMA cross-checks for key aggregates. In academic econometrics, ARIMA is the natural null model in forecast comparison tests -- Diebold-Mariano tests and Model Confidence Set procedures almost always include an ARIMA benchmark in the comparison pool.

The Box-Jenkins framework predates modern information criteria. Originally, identification relied on visual inspection of sample autocorrelation and partial autocorrelation functions. Akaike (1974) and Schwarz (1978) introduced AIC and BIC, which enabled automated order selection. Hyndman and Khandakar (2008) operationalized fully automatic ARIMA fitting in the forecast package for R, which remains the most widely deployed automated ARIMA implementation. Python's statsmodels and pmdarima libraries provide equivalent functionality. The auto.arima algorithm combines unit-root pretesting (KPSS, ADF), stepwise order selection via AICc, and invertibility/stationarity constraint checking into a single pipeline.
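The information-criterion step can be sketched in a few lines of NumPy. This is a toy illustration, not the auto.arima algorithm: it fits pure AR(p) models by conditional least squares on a simulated series and picks the order by BIC, skipping the unit-root pretest, MA terms, and the AICc and stepwise refinements described above.

```python
import numpy as np

# Simulate an AR(2) series as a stand-in for a differenced macro series
# (toy DGP chosen for illustration; coefficients are not estimates).
rng = np.random.default_rng(42)
n = 2000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal()

MAX_P = 5  # largest candidate order; all fits share the same sample

def ar_bic(y, p):
    """Conditional least-squares AR(p) fit; returns the BIC."""
    target = y[MAX_P:]                                  # common estimation sample
    lags = [y[MAX_P - j : len(y) - j] for j in range(1, p + 1)]
    X = np.column_stack([np.ones(len(target))] + lags)  # intercept + p lag columns
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    rss = np.sum((target - X @ beta) ** 2)
    T = len(target)
    k = p + 1  # free coefficients incl. intercept (variance term common to all p)
    return T * np.log(rss / T) + k * np.log(T)

bics = {p: ar_bic(y, p) for p in range(1, MAX_P + 1)}
best_p = min(bics, key=bics.get)
print(best_p)  # BIC should recover an order at or near the true p = 2
```

The same loop with a 2k penalty instead of k log(T) gives AIC, which tends to select the larger orders mentioned in the Open Questions below.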

How the Parts Fit Together

The input is a single time series observed at regular intervals. The series enters the model through the differencing operator (1 - L), where L is the lag operator defined by Ly_t = y_{t-1}; the operator is applied d times to produce a stationary working series. If the raw series has a unit root, first differencing (d=1) removes the stochastic trend. If the series has a second unit root -- rare in macro practice but possible for price levels -- second differencing (d=2) is applied. The differenced series is the object that the ARMA filter acts on.
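A minimal NumPy check of the d = 1 case: a pure random walk is I(1), and a single application of (1 - L) recovers the underlying white-noise innovations exactly (simulated data, assumed DGP).

```python
import numpy as np

rng = np.random.default_rng(0)
eps = rng.normal(size=5000)   # white-noise innovations
y = np.cumsum(eps)            # random walk: y_t = y_{t-1} + eps_t, an I(1) series

w = np.diff(y)                # first difference (1 - L) y_t

# Differencing exactly undoes the accumulation of shocks
assert np.allclose(w, eps[1:])
```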

The ARMA filter operates on the differenced series through two polynomial functions of the lag operator. The autoregressive polynomial maps lagged values of the series into the current-period conditional mean. The moving-average polynomial maps lagged innovations into the current-period conditional mean. The innovations are the one-step-ahead forecast errors under the model -- they are the new information arriving each period that the model could not have predicted from its own past. The conditional variance of the innovations is assumed constant (homoskedastic), which is the main structural constraint the basic ARIMA imposes.

Estimation proceeds by conditional or exact maximum likelihood. The likelihood function is constructed from the one-step-ahead prediction error decomposition: each observation contributes a Gaussian log-density evaluated at the prediction error, scaled by the conditional variance. The MLE jointly estimates the AR coefficients, the MA coefficients, and the innovation variance. Stationarity requires all roots of the AR polynomial to lie outside the unit circle; invertibility requires the same for the MA polynomial. These constraints are enforced either by reparameterization or by constrained optimization.
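As a sketch of the prediction-error decomposition, the snippet below profiles out the innovation variance and grid-searches the conditional Gaussian log-likelihood of an AR(1) -- the simplest ARIMA(1, 0, 0) case -- on simulated data. Production implementations (statsmodels, the R forecast package) use exact likelihood and numerical optimization rather than a grid; this is only the conditional version, shown for intuition.

```python
import numpy as np

# Simulate AR(1): y_t = 0.7 y_{t-1} + eps_t (toy DGP for illustration)
rng = np.random.default_rng(1)
n = 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.normal()

def cond_loglik(phi, y):
    """Conditional Gaussian log-likelihood with sigma^2 profiled out."""
    e = y[1:] - phi * y[:-1]          # one-step-ahead prediction errors
    s2 = np.mean(e ** 2)              # MLE of innovation variance given phi
    T = len(e)
    return -0.5 * T * (np.log(2 * np.pi * s2) + 1.0)

# Stationarity is enforced here simply by restricting the grid to (-1, 1)
grid = np.linspace(-0.99, 0.99, 199)
phi_hat = grid[np.argmax([cond_loglik(p, y) for p in grid])]
print(phi_hat)  # close to the true value 0.7
```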

Applications

The primary use case is constructing a univariate forecast baseline for a single macro indicator. At the Federal Reserve Bank of Cleveland, ARIMA benchmarks for CPI components are generated monthly to detect whether multivariate models add genuine signal above the univariate floor. The ARIMA forecast serves as the null hypothesis: if a richer model cannot beat the ARIMA baseline in a pseudo-out-of-sample exercise, the added complexity is not justified. This role as a forecast benchmark makes ARIMA the most commonly estimated model in applied macroeconomic forecasting, even when it is not the final production forecast.

A secondary application is signal extraction and seasonal adjustment. The X-13ARIMA-SEATS program, maintained by the U.S. Census Bureau, fits a seasonal ARIMA model to decompose a series into trend, seasonal, and irregular components. This decomposition underpins the official seasonally adjusted figures published for GDP, employment, industrial production, and retail sales across most OECD countries. The ARIMA model inside X-13 is not a forecast tool in this context -- it is a signal-extraction filter that enables other forecasters to work with clean, deseasonalized data.

ARIMA hits a hard wall when the forecasting question involves multiple interacting variables. If the analyst cares about the joint dynamics of output, inflation, and interest rates, a univariate model for each series misses the cross-variable predictive information that a VAR would capture. ARIMA also cannot incorporate exogenous regressors in its pure form (ARIMAX extends it, but the extension is fragile and rarely preferred over a transfer function model or a small VAR). When the forecast horizon extends beyond 4-8 quarters, ARIMA forecasts converge rapidly to the unconditional mean of the differenced series, offering little value over a random walk with drift.
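The convergence to the unconditional mean is easiest to see in the AR(1) case, where the h-step forecast is \hat{y}_{T+h} = \mu + \phi^h (y_T - \mu) and the deviation from \mu decays geometrically (illustrative numbers below, not estimates):

```python
import numpy as np

mu, phi, y_T = 2.0, 0.6, 5.0           # illustrative parameters, not estimates
h = np.arange(1, 13)                    # forecast horizons 1..12
forecasts = mu + phi ** h * (y_T - mu)  # AR(1) h-step-ahead point forecast

# phi**h -> 0, so the forecast path collapses onto the unconditional mean mu
print(forecasts[0])    # 3.8  (one step ahead: mu + 0.6 * 3.0)
print(forecasts[-1])   # ~2.0 (twelve steps ahead, essentially mu)
```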

In real-time forecasting competitions -- the M3 and M4 competitions organized by Makridakis and colleagues -- simple ARIMA specifications routinely perform within a few percentage points of the best submissions across thousands of series. This resilience comes from the parsimony of the model: with typical macro series lengths of 100-300 observations, an ARIMA(1,1,1) has only 4 free parameters (two coefficients, a constant, and the innovation variance), leaving very little room for overfitting.

Literature and Extensions

Key Papers

  • Box and Jenkins (1970) -- Time Series Analysis: Forecasting and Control. The foundational monograph that codified the identification-estimation-diagnostic cycle.
  • Akaike (1974) -- A new look at the statistical model identification (IEEE Transactions on Automatic Control). Introduced AIC for automated order selection.
  • Schwarz (1978) -- Estimating the dimension of a model (Annals of Statistics). Introduced BIC, which penalizes complexity more heavily than AIC.
  • Hyndman and Khandakar (2008) -- Automatic time series forecasting: the forecast package for R (Journal of Statistical Software). Operationalized auto.arima with stepwise AICc selection.
  • Hamilton (1994) -- Time Series Analysis, Chapter 5. The standard graduate textbook treatment of ARIMA estimation and asymptotics.

Named Variants

  • SARIMA -- multiplicative seasonal extension with seasonal AR, differencing, and MA terms
  • ARIMAX / Transfer function -- ARIMA with exogenous regressors in the conditional mean
  • Fractional ARIMA (ARFIMA) -- replaces integer differencing with fractional d for long memory
  • Threshold ARIMA (TAR) -- piecewise linear ARMA with regime-dependent coefficients
  • ARIMA-GARCH -- ARIMA conditional mean paired with GARCH conditional variance

Open Questions

  • Whether AIC or BIC is preferable for macro forecast applications remains debated: AIC tends to select larger models that fit the in-sample data better but may overfit, while BIC is consistent for model selection but may underfit in finite samples.
  • The treatment of structural breaks in automated ARIMA pipelines is ad hoc -- most implementations estimate over a fixed rolling window or the full sample, with no formal break detection. Adaptive ARIMA with online break detection is an active research area.
  • The interaction between seasonal adjustment (X-13ARIMA-SEATS) and subsequent ARIMA forecasting on the adjusted series is not always handled coherently -- the two-step process can distort forecast intervals.

Components

y_t -- Observed series

The raw time series at date t, measured at a fixed frequency (monthly, quarterly, annual).

\phi(L) -- Autoregressive polynomial

The polynomial 1 - \phi_1 L - \cdots - \phi_p L^p whose roots must lie outside the unit circle for stationarity.

(1 - L)^d -- Differencing operator

Applied d times to remove stochastic trends; maps an I(d) process to a stationary I(0) residual.

\theta(L) -- Moving-average polynomial

The polynomial 1 + \theta_1 L + \cdots + \theta_q L^q whose roots must lie outside the unit circle for invertibility.

\varepsilon_t -- Innovation

White-noise disturbance with E[\varepsilon_t] = 0 and Var(\varepsilon_t) = \sigma^2, serially uncorrelated by construction.

\sigma^2 -- Innovation variance

The constant variance of the one-step-ahead prediction error; scales forecast uncertainty bands.

L -- Lag operator

Defined by L^k y_t = y_{t-k}; the algebraic device that makes polynomial manipulation of lagged series possible.

Assumptions

Covariance stationarity (after differencing) -- Testable

After d applications of the difference operator, the resulting series w_t = (1-L)^d y_t has a time-invariant mean, variance, and autocovariance function: E[w_t] = \mu, Var(w_t) = \gamma_0, Cov(w_t, w_{t-k}) = \gamma_k for all t.

If violated: If the differenced series is still non-stationary, AR coefficient estimates are inconsistent and forecast intervals have incorrect coverage. Apply an additional difference or consider a fractional integration model.

Linearity -- Testable

The conditional mean of w_t is a linear function of its own past values and past innovations. No threshold, regime-switching, or nonlinear dependence enters the conditional mean.

If violated: If the true DGP is nonlinear (e.g., SETAR, Markov-switching), ARIMA captures only the linear projection and may underperform nonlinear alternatives at medium horizons.

Homoskedastic innovations -- Testable

Var(\varepsilon_t | \mathcal{F}_{t-1}) = \sigma^2 for all t. The conditional variance does not depend on past values or past shocks.

If violated: If innovations exhibit ARCH/GARCH effects, the point forecast remains consistent but forecast intervals are miscalibrated -- too narrow during volatile periods, too wide during calm periods.

No structural breaks -- Testable

The AR and MA coefficients and the innovation variance are constant across the estimation window. The DGP does not change regime.

If violated: A structural break inside the sample biases coefficient estimates toward a weighted average of the pre- and post-break regimes, degrading both in-sample fit and out-of-sample forecast accuracy.

Correct integration order -- Testable

The order d is correctly specified: the raw series is I(d) and the differenced series is I(0). Over-differencing introduces an MA unit root; under-differencing leaves a unit root in the ARMA representation.

If violated: Over-differencing (d too large) inflates forecast variance and introduces artificial negative autocorrelation. Under-differencing (d too small) leaves a spurious trend in the forecast path.
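The over-differencing symptom is easy to reproduce: differencing a series that is already I(0) creates an MA(1) with a unit root (\theta = -1), whose lag-1 autocorrelation is -0.5. A quick NumPy check on simulated white noise (assumed DGP):

```python
import numpy as np

rng = np.random.default_rng(7)
eps = rng.normal(size=100_000)   # an I(0) series: pure white noise
w = np.diff(eps)                 # over-differenced: now MA(1) with theta = -1

def acf1(x):
    """Sample lag-1 autocorrelation."""
    x = x - x.mean()
    return np.dot(x[1:], x[:-1]) / np.dot(x, x)

print(acf1(eps))  # near 0: the level series is already uncorrelated
print(acf1(w))    # near -0.5: the artificial negative autocorrelation
```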

Gaussian innovations (for exact MLE) -- Testable

\varepsilon_t \sim N(0, \sigma^2). Required for exact MLE inference; relaxed under quasi-MLE where only the first two moments matter.

If violated: Non-Gaussian innovations do not affect consistency of the quasi-MLE but invalidate small-sample likelihood ratio tests and exact confidence intervals. Robust standard errors or bootstrap inference may be needed.