Macro by Mark
© 2026 Mark Jayson Nation



SARIMA
Model

Seasonal ARIMA -- extends ARIMA with explicit seasonal AR, differencing, and MA terms for series with calendar structure.

How do you forecast a series whose autocorrelation structure repeats at a known calendar frequency?

Background

The Seasonal ARIMA model -- written SARIMA or ARIMA(p,d,q)(P,D,Q)_s -- extends the Box-Jenkins framework by introducing a second set of autoregressive and moving-average polynomials that operate at the seasonal lag s rather than at consecutive lags. Box and Jenkins formalized the multiplicative seasonal structure in Time Series Analysis: Forecasting and Control (1970; revised 1976), responding to the observation that many economic and physical series carry periodic patterns (monthly employment, quarterly GDP, weekly retail traffic) that ordinary ARIMA cannot capture without an impractically high lag order. The multiplicative form was the key insight: instead of fitting a single high-order ARMA polynomial, SARIMA factors the dependence structure into a non-seasonal part (operating at lags 1, 2, ..., p) and a seasonal part (operating at lags s, 2s, ..., Ps), then multiplies the two polynomials together. This dramatically reduces the parameter count while preserving the model's ability to represent rich seasonal autocorrelation patterns.

The statistical mechanism works as follows. Ordinary differencing removes stochastic trends at the non-seasonal frequency; seasonal differencing -- applying the operator (1 - L^s) D times -- removes stochastic trends at the seasonal frequency. After both differencing operations, the resulting series should be covariance-stationary. The non-seasonal ARMA(p,q) component captures short-run serial dependence within each season, while the seasonal ARMA(P,Q)_s component captures year-over-year persistence and the rate at which seasonal shocks dissipate. The multiplicative interaction between the two ARMA filters generates cross-terms (e.g., at lag s+1, s-1) that model the interaction between within-season dynamics and between-season dynamics -- something an additive decomposition would miss entirely.

SARIMA is the backbone of official seasonal adjustment worldwide. The U.S. Census Bureau's X-13ARIMA-SEATS program fits a SARIMA model as its core engine: the model's residuals drive the SEATS signal-extraction filters that separate trend, seasonal, and irregular components for GDP, employment, industrial production, trade balance, and hundreds of other official series. Statistics Canada, Eurostat, the Bank of Japan, and virtually every other national statistical office use SARIMA-based seasonal adjustment. In pure forecasting, SARIMA remains the standard univariate benchmark for series with calendar patterns -- the M3 and M4 forecast competitions show that automatic SARIMA (via auto.arima or equivalent) consistently ranks in the top tier for monthly and quarterly macro series.

The most common specification in macro practice is ARIMA(0,1,1)(0,1,1)_s, known as the airline model because Box and Jenkins used it to forecast international airline passengers. This parsimonious specification -- two MA parameters, ordinary and seasonal differencing -- captures an astonishing variety of seasonal macro series. Hyndman and Khandakar's auto.arima algorithm tests the airline model as one of its first candidates. When a richer specification is needed, the algorithm searches over a grid of (p,d,q)(P,D,Q) combinations using AICc, typically capping P and Q at 1 or 2 to prevent overfitting. The seasonal period s is not estimated -- it is set by the data frequency (s=4 for quarterly, s=12 for monthly, s=52 for weekly).

How the Parts Fit Together

The input is a single time series observed at a regular frequency with a known seasonal period s. The series enters the model through two differencing stages. Ordinary differencing applies (1 - L)^d to remove non-seasonal stochastic trends. Seasonal differencing applies (1 - L^s)^D to remove stochastic seasonal trends -- persistent changes in the seasonal pattern itself, such as a gradually strengthening December retail peak. Together, these two operators transform the raw series y_t into a doubly-differenced working series w_t = (1-L)^d (1-L^s)^D y_t that should be covariance-stationary. The combined differencing order is d + D*s, so a monthly series with d=1, D=1 loses 13 observations to differencing.
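The double-differencing transform can be sketched directly in numpy; the observation count confirms the d + D*s loss described above:

```python
# Sketch of w_t = (1-L)^d (1-L^s)^D y_t using numpy only.
# The linear-trend input is illustrative: one ordinary difference kills it.
import numpy as np

def sarima_difference(y, d=1, D=1, s=12):
    """Apply D seasonal and d ordinary differences to a 1-D array."""
    w = np.asarray(y, dtype=float)
    for _ in range(D):
        w = w[s:] - w[:-s]           # seasonal difference (1 - L^s)
    for _ in range(d):
        w = w[1:] - w[:-1]           # ordinary difference (1 - L)
    return w

y = np.arange(120.0)                 # a 10-year monthly linear trend
w = sarima_difference(y, d=1, D=1, s=12)
print(len(y) - len(w))               # 13 observations lost: d + D*s = 1 + 12
```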

The ARMA filter applied to the working series is the product of a non-seasonal and a seasonal component. The non-seasonal AR polynomial phi(L) = 1 - phi_1 L - ... - phi_p L^p captures dependence at consecutive lags (lag 1 through lag p). The seasonal AR polynomial Phi(L^s) = 1 - Phi_1 L^s - ... - Phi_P L^{Ps} captures dependence at seasonal lags (lag s through lag Ps). The full AR filter is the product phi(L) * Phi(L^s), which generates terms at lags 1, 2, ..., p, s, s+1, ..., s+p, 2s, ..., etc. The same multiplicative structure applies to the MA side: theta(L) * Theta(L^s). The multiplicative form is what distinguishes SARIMA from simply fitting a high-order ARIMA with seasonal dummies -- it imposes parsimony by factoring the lag polynomial into two low-order pieces.
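The cross-terms generated by the multiplicative form are easy to see by multiplying the two lag polynomials numerically; the coefficient values below (0.5 and 0.8) are arbitrary illustrations:

```python
# Multiplying phi(L) * Phi(L^s) for p=1, P=1, s=12 to show which lags
# the multiplicative form activates -- including the cross-term at s+1.
import numpy as np

p, P, s = 1, 1, 12
phi = np.zeros(p + 1)
phi[0], phi[1] = 1.0, -0.5           # phi(L) = 1 - 0.5 L
Phi = np.zeros(P * s + 1)
Phi[0], Phi[s] = 1.0, -0.8           # Phi(L^12) = 1 - 0.8 L^12

full = np.convolve(phi, Phi)         # product polynomial phi(L) * Phi(L^s)
active = np.nonzero(full)[0]
print(active)                        # lags 0, 1, 12, 13
print(full[13])                      # cross-term coefficient: (-0.5)*(-0.8) = 0.4
```

Only four lags carry nonzero coefficients, yet the lag-13 cross-term is determined by the other two parameters rather than estimated freely -- exactly the parsimony constraint the multiplicative form imposes.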

Estimation uses the same maximum likelihood framework as non-seasonal ARIMA, extended to handle the multiplicative polynomial structure. The prediction error decomposition constructs one-step-ahead forecast errors by running the Kalman filter through the state-space representation of the seasonal model. The log-likelihood is the sum of Gaussian log-densities evaluated at each prediction error. Stationarity requires all roots of the combined AR polynomial to lie outside the unit circle; invertibility requires the same for the combined MA polynomial. These constraints are enforced during optimization. Order selection uses AICc over the (p,d,q,P,D,Q) grid, with d and D typically pretested using ADF/KPSS (non-seasonal) and OCSB/Canova-Hansen (seasonal) unit-root tests.

Applications

The primary use case is forecasting economic series that carry strong calendar periodicity. At the Bureau of Labor Statistics, SARIMA models underpin the seasonal adjustment of the Current Employment Statistics (CES) survey, which produces the headline nonfarm payrolls number released on the first Friday of every month. The X-13ARIMA-SEATS engine fits an ARIMA(0,1,1)(0,1,1)_12 or close variant to each of the roughly 800 CES series, extracts the seasonal component via SEATS signal-extraction filters, and produces the seasonally adjusted series that markets and policymakers watch. Without SARIMA, there is no reliable automatic seasonal adjustment pipeline for high-frequency macro data.

A secondary application is direct forecasting of seasonal time series in business and operations. Retail firms use SARIMA to forecast weekly same-store sales and plan inventory allocation by store and category. Electric utilities use SARIMA on hourly load data (with s=24 or s=168) to schedule generation and manage reserve margins. Agriculture agencies use SARIMA on monthly crop price series to project seasonal price peaks for commodity procurement planning. In each case, the model's strength is capturing a repeating pattern while allowing the pattern's amplitude and phase to be estimated from data rather than imposed by dummy variables.

SARIMA reaches its limits when the seasonal pattern itself is changing. If the December share of annual retail sales has been trending upward over 20 years, a fixed-coefficient SARIMA will underestimate the next December and overestimate the next January. The model also struggles with multiple overlapping seasonal frequencies (e.g., daily data with both weekly and annual cycles) -- fitting ARIMA(p,d,q)(P,D,Q)_365 is computationally impractical and statistically fragile. For such series, TBATS, Prophet, or Fourier-term regression provide more scalable approaches to multiple seasonality.
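The Fourier-term alternative mentioned above can be sketched as a simple regressor builder; the harmonic count K per period (chosen freely here) trades smoothness against flexibility:

```python
# Sketch of Fourier-term regressors for multiple seasonality: daily data
# with both a weekly (period 7) and an annual (period 365.25) cycle.
import numpy as np

def fourier_terms(n, period, K):
    """Return an (n, 2K) matrix of sin/cos regressors for one seasonal period."""
    t = np.arange(n)
    cols = []
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

n = 730                                    # two years of daily observations
X = np.column_stack([fourier_terms(n, 7, 3),        # 6 weekly terms
                     fourier_terms(n, 365.25, 10)]) # 20 annual terms
print(X.shape)                             # (730, 26)
```

These columns would enter a regression (or the exogenous block of a regARIMA) in place of the infeasible ARIMA(p,d,q)(P,D,Q)_365 specification.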

In the M4 forecasting competition (2018, 100,000 series), automatic SARIMA -- implemented via Hyndman's forecast package -- ranked as one of the strongest univariate statistical methods for monthly data, outperforming most pure machine-learning entries. Its accuracy-to-complexity ratio remains hard to beat: a single ARIMA(0,1,1)(0,1,1)_12 with 3 free parameters (theta_1, Theta_1, sigma^2) often matches or beats neural networks with thousands of parameters on series with stable seasonal structure.

Literature and Extensions

Key Papers

  • Box and Jenkins (1970; 2nd ed. 1976) -- Time Series Analysis: Forecasting and Control. Formalized the multiplicative seasonal ARIMA structure; later editions add Reinsel as coauthor.
  • Hillmer and Tiao (1982) -- An ARIMA-model-based approach to seasonal adjustment (Journal of the American Statistical Association). Founded the model-based seasonal adjustment methodology used in SEATS.
  • Gomez and Maravall (1996) -- Programs TRAMO and SEATS: Instructions for the user. The technical reference for the SEATS algorithm embedded in X-13ARIMA-SEATS.
  • Hyndman and Khandakar (2008) -- Automatic time series forecasting: the forecast package for R (Journal of Statistical Software). Codified automatic SARIMA fitting with stepwise AICc.
  • Findley, Monsell, Bell, Otto, and Chen (1998) -- New capabilities and methods of the X-12-ARIMA seasonal adjustment program (Journal of Business and Economic Statistics). Documented the integration of regARIMA with X-12 that evolved into X-13.

Named Variants

  • Airline model ARIMA(0,1,1)(0,1,1)_s -- the most widely deployed SARIMA specification in seasonal adjustment
  • RegSARIMA -- SARIMA with regression effects for trading-day variation, Easter, outliers (the regARIMA component of X-13)
  • Periodic ARIMA (PAR) -- allows AR and MA coefficients to vary by season within the year
  • SARIMA-GARCH -- pairs seasonal conditional mean with GARCH conditional variance for financial series
  • TBATS -- exponential smoothing with Box-Cox, ARMA errors, trend, and multiple seasonal periods (generalizes SARIMA to multiple s)

Open Questions

  • Whether seasonal differencing (D=1) or seasonal dummies in a regARIMA framework produces more robust seasonal adjustment is debated. Seasonal differencing assumes a unit root at the seasonal frequency; if the true DGP has deterministic seasonality, this introduces an unnecessary MA unit root.
  • The choice between model-based (SEATS) and filter-based (X-11) seasonal adjustment remains contested. Both use SARIMA, but SEATS derives its filters from the fitted model while X-11 applies fixed moving averages. SEATS is theoretically optimal under correct specification; X-11 is more robust to misspecification.
  • Handling of calendar effects (trading days, moving holidays like Easter and Chinese New Year) in SARIMA is ad hoc -- regARIMA treats them as external regressors, but the choice of regressor form affects the seasonal component and is hard to validate.

Components

y_t -- Observed series

The raw time series at date t, measured at a fixed frequency with known seasonal period s.

\phi(L) -- Non-seasonal AR polynomial

The polynomial 1 - \phi_1 L - \cdots - \phi_p L^p capturing short-run persistence within each season.

\Phi(L^s) -- Seasonal AR polynomial

The polynomial 1 - \Phi_1 L^s - \cdots - \Phi_P L^{Ps} capturing year-over-year persistence at the seasonal frequency.

\theta(L) -- Non-seasonal MA polynomial

The polynomial 1 + \theta_1 L + \cdots + \theta_q L^q capturing the within-season dissipation of past forecast errors.

\Theta(L^s) -- Seasonal MA polynomial

The polynomial 1 + \Theta_1 L^s + \cdots + \Theta_Q L^{Qs} capturing the seasonal dissipation of past seasonal forecast errors.

(1-L)^d (1-L^s)^D -- Combined differencing operator

Applies d ordinary differences and D seasonal differences to map the raw series to a covariance-stationary working series.

\varepsilon_t -- Innovation

White-noise disturbance with E[\varepsilon_t] = 0 and Var(\varepsilon_t) = \sigma^2, serially uncorrelated.

s -- Seasonal period

The number of observations per seasonal cycle: s=4 (quarterly), s=12 (monthly), s=52 (weekly). Fixed, not estimated.
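Assembled, the components above give the full multiplicative SARIMA equation:

```latex
\phi(L)\,\Phi(L^{s})\,(1-L)^{d}(1-L^{s})^{D}\, y_t
  = \theta(L)\,\Theta(L^{s})\,\varepsilon_t
```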

Assumptions

Covariance stationarity (after both differencing operations) -- Testable

After d ordinary and D seasonal differences, the working series w_t = (1-L)^d (1-L^s)^D y_t has a time-invariant mean, variance, and autocovariance structure.

If violated: If the doubly-differenced series is still non-stationary, the seasonal ARMA coefficients are inconsistent. Either increase D from 0 to 1, or investigate a changing seasonal pattern that requires a time-varying approach.

Multiplicative seasonal structure -- Maintained

The true DGP's lag polynomial factors into a non-seasonal and a seasonal component: the full AR filter is phi(L) * Phi(L^s), not an arbitrary polynomial of order p + Ps with free coefficients at every lag.

If violated: If the true seasonal lag structure is additive rather than multiplicative, the cross-terms imposed by the product form are wrong. The model still converges to the best multiplicative approximation, but may require higher orders (p, P) than the true additive DGP would need.

Stable seasonal pattern -- Testable

The seasonal AR and MA coefficients are constant over the estimation window. The shape and amplitude of the seasonal cycle do not change over time.

If violated: If the seasonal pattern evolves (e.g., Christmas retail share growing over decades), fixed-coefficient SARIMA misrepresents the seasonal component. Consider STL decomposition with time-varying seasonality, or ETS with multiplicative seasonality and a damped trend.

Linearity -- Testable

The conditional mean of w_t is a linear function of its own lagged values and past innovations at both seasonal and non-seasonal lags.

If violated: Nonlinear seasonal patterns (e.g., asymmetric seasonal peaks vs. troughs) are not captured. The model produces the best linear approximation, which may have high residual autocorrelation at seasonal lags.

Homoskedastic innovations -- Testable

Var(\varepsilon_t | \mathcal{F}_{t-1}) = \sigma^2 for all t. Innovation variance does not vary with the season or the level of the series.

If violated: Many seasonal series have variance proportional to level (heteroskedastic seasonality). Forecast intervals are miscalibrated. A log transformation or multiplicative ETS may be appropriate.

Correct seasonal period -- Testable

The seasonal period s is correctly specified. The dominant periodic component of the series aligns with the declared value of s.

If violated: Misspecified s (e.g., using s=12 for a series with dominant s=6 half-yearly cycle) creates systematic forecast errors at the mismatched frequency. Spectral analysis or periodogram inspection can detect the true dominant period.
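Periodogram inspection can be sketched with numpy's FFT; the synthetic series below has a dominant half-yearly cycle (s=6), which the spectral peak recovers:

```python
# Detecting the dominant seasonal period from the raw periodogram,
# on a synthetic series with a true period of 6 observations.
import numpy as np

rng = np.random.default_rng(2)
n = 240
t = np.arange(n)
y = np.sin(2 * np.pi * t / 6) + rng.normal(0, 0.2, n)

spec = np.abs(np.fft.rfft(y - y.mean())) ** 2   # power at each frequency bin
freqs = np.fft.rfftfreq(n)                      # cycles per observation
dominant = 1.0 / freqs[np.argmax(spec[1:]) + 1] # skip the zero frequency
print(dominant)                                 # 6.0, not an assumed 12
```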