Macro by Mark

© 2026 Mark Jayson Nation



VAR
Model

Vector autoregression -- a small system where each macro variable is regressed on lags of itself and the other variables in the system.

How do you let a small set of macro variables forecast each other without imposing which ones cause which?
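In symbols, using the notation defined in the Components section below, the reduced-form VAR(p) is:

```latex
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + u_t,
\qquad \mathbb{E}[u_t] = 0, \quad \mathbb{E}[u_t u_t'] = \Sigma
```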

Background

Christopher Sims introduced the VAR in his 1980 Econometrica paper 'Macroeconomics and Reality' as a direct challenge to the large-scale structural macroeconometric models that dominated policy institutions in the 1970s. Those models required hundreds of a priori exclusion restrictions to achieve identification -- assumptions Sims argued were 'incredible' and untestable. The VAR sidesteps the problem: every variable is regressed on its own lags and the lags of every other variable in the system, with no restrictions on which coefficients are zero. The data decides which lags matter. Sims won the 2011 Nobel Prize in Economics largely for this contribution.

A VAR(p) with n variables stacks n regression equations. Each equation has n*p slope coefficients plus an intercept, so the full system has n^2*p + n free parameters. A 4-variable VAR(4) already has 68 coefficients. With typical macro sample sizes of 80-200 quarterly observations, this is a lot of parameters relative to data -- the 'curse of dimensionality' that defines the practical boundary of VAR analysis. Most applied work keeps n between 3 and 7 and p between 1 and 4. Beyond that, regularization (Bayesian shrinkage, Minnesota prior) or dimension reduction (factor-augmented VAR) is needed to prevent overfitting.
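The parameter count above is easy to verify; a minimal sketch:

```python
def var_param_count(n: int, p: int) -> int:
    """Free parameters in the conditional mean of a VAR(p) with n variables:
    each of the n equations has n*p slope coefficients plus an intercept."""
    return n * n * p + n

# The 4-variable VAR(4) from the text:
print(var_param_count(4, 4))   # 68
```

At the practical upper bound of n = 7 with p = 4, the count is already 203, which is why shrinkage becomes unavoidable with quarterly samples.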

Three outputs define the VAR toolkit. Granger causality tests check whether lags of variable X have statistically significant predictive power for variable Y after controlling for Y's own lags. Impulse response functions (IRFs) trace how a one-standard-deviation shock to one variable propagates through the system over time. Forecast error variance decompositions (FEVDs) measure what fraction of each variable's forecast uncertainty is attributable to shocks in each other variable. Together, these tools make VAR the workhorse for empirical macro analysis of dynamic interactions among aggregate variables.

The Federal Reserve Board's staff uses VARs as one input to the Tealbook forecast. The ECB maintains a suite of VARs for the euro area. The Bank of England's forecasting platform includes VARs alongside its structural DSGE model COMPASS. In academic macro, VARs dominate the empirical identification literature: every paper estimating the effects of monetary policy, fiscal policy, or oil price shocks either uses a VAR directly or validates a structural model against VAR-identified IRFs. Stock and Watson's (2001) survey counted VARs as the single most common empirical tool in published macroeconomic research.

How the Parts Fit Together

The input is a vector of n time series y_t = (y_{1t}, y_{2t}, ..., y_{nt})' observed at the same frequency over T periods. All n variables enter every equation symmetrically -- no variable is treated as exogenous. Each variable at time t is expressed as a linear function of its own lagged values and the lagged values of all other variables, plus a contemporaneous innovation. The lag length p determines how far back the model looks; it is chosen by information criteria (AIC, BIC, HQ) or sequential likelihood ratio tests.

The reduced-form VAR does not identify structural shocks. The innovation vector u_t is a composite of all contemporaneous structural disturbances hitting the system at time t. Identifying individual structural shocks requires additional assumptions beyond the data -- Cholesky ordering (recursive structure), sign restrictions, long-run restrictions (Blanchard-Quah), external instruments (proxy SVAR), or narrative identification. The reduced-form VAR is the raw statistical object; structural identification is the interpretive layer placed on top.
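The Cholesky option mentioned above is mechanically simple: factor Sigma into a lower-triangular impact matrix. A minimal sketch with a made-up covariance matrix:

```python
import numpy as np

# Hypothetical reduced-form innovation covariance (illustrative numbers only)
Sigma = np.array([[1.00, 0.30, 0.20],
                  [0.30, 0.50, 0.10],
                  [0.20, 0.10, 0.25]])

P = np.linalg.cholesky(Sigma)        # lower triangular, Sigma = P @ P.T
# Recursive identification: u_t = P @ eps_t with eps_t orthonormal, so the
# first variable responds only to its own shock within the period. Reordering
# the variables changes P, and hence the identified shocks.
print(np.allclose(P @ P.T, Sigma))   # True
```

The ordering dependence is why Cholesky results are always reported together with the assumed recursive ordering.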

Estimation is equation-by-equation OLS. Because every equation has the same set of right-hand-side variables (all lagged variables from 1 to p), OLS on each equation separately is numerically identical to GLS on the full system. This makes VAR estimation trivially fast: it is just n separate regressions. The innovation covariance matrix Sigma is estimated from the OLS residuals. All standard inference -- Granger causality F-tests, lag selection criteria, portmanteau residual tests -- follows directly from the OLS output.
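The mechanics can be sketched in a few lines of NumPy: one common regressor matrix, one least-squares solve per equation, residual covariance from the stacked residuals. The data are simulated stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, p = 120, 3, 2
y = rng.standard_normal((T, n))     # illustrative data; any n series work here

# Common regressor matrix X: intercept plus lags 1..p of every variable.
X = np.array([np.concatenate([[1.0]] + [y[t - i] for i in range(1, p + 1)])
              for t in range(p, T)])        # (T - p) x (1 + n*p)
Y = y[p:]                                   # left-hand sides, all n equations

# Equation-by-equation OLS: one solve per column of Y. Because every equation
# shares the same X, this matches joint (GLS) estimation of the system.
B = np.linalg.lstsq(X, Y, rcond=None)[0]    # (1 + n*p) x n coefficients
U = Y - X @ B                               # reduced-form residuals
Sigma = U.T @ U / (T - p)                   # innovation covariance estimate

print(B.shape, Sigma.shape)                 # (7, 3) (3, 3)
```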

Applications

The primary use case is identifying and tracing the dynamic effects of macroeconomic shocks. Christiano, Eichenbaum, and Evans (1999) used a Cholesky-identified VAR with output, prices, and the federal funds rate to estimate monetary policy transmission -- that paper defined how a generation of economists thinks about the effects of interest rate changes on GDP and inflation. Blanchard and Perotti (2002) used a structural VAR to estimate fiscal multipliers. Kilian (2009) used sign restrictions in a VAR to decompose oil price movements into demand-driven and supply-driven components. In each case, the VAR provides the empirical object (the reduced-form dynamics) and identification assumptions provide the structural interpretation.

VARs also serve as pure forecasting tools. The Federal Reserve Bank of Minneapolis maintains a Bayesian VAR (BVAR) as a benchmark forecast for GDP, inflation, and unemployment. Litterman (1986) showed that BVARs with the Minnesota prior (shrink all coefficients toward zero, with own-lags shrunk less than cross-lags) produce forecasts competitive with large structural models while requiring orders of magnitude less maintenance. The Minneapolis Fed's BVAR has been continuously updated since the 1980s, making it one of the longest-running automated macro forecast systems.
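The shrinkage idea can be sketched per equation as a ridge-style posterior mean. This is a stylized illustration under simplifying assumptions (innovation variance normalized to 1, independent normal priors), not Litterman's full specification, and every number below is made up:

```python
import numpy as np

def shrunk_coefficients(X, y, b0, prior_var):
    """Posterior mean for one VAR equation under priors b_j ~ N(b0_j, prior_var_j),
    with the innovation variance normalized to 1 (a Minnesota-style ridge sketch)."""
    Vinv = np.diag(1.0 / prior_var)
    return np.linalg.solve(X.T @ X + Vinv, X.T @ y + Vinv @ b0)

rng = np.random.default_rng(3)
T = 100
X = rng.standard_normal((T, 5))   # stand-in for: intercept + 2 lags of 2 variables
y = X @ np.array([0.0, 0.9, 0.1, 0.2, 0.0]) + rng.standard_normal(T)

b0 = np.array([0.0, 1.0, 0.0, 0.0, 0.0])   # prior mean: own first lag near 1
prior_var = np.array([100.0,               # intercept: essentially unshrunk
                      0.25, 0.05,          # lag 1: own lag looser than cross lag
                      0.0625, 0.0125])     # lag 2: tighter, as in lambda^2 / l^2
b_post = shrunk_coefficients(X, y, b0, prior_var)
print(b_post.round(2))
```

The key design choice is the pattern in prior_var: own lags are shrunk less than cross lags, and distant lags more than recent ones.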

The VAR hits a hard wall at roughly n = 7-8 variables. Beyond that, the number of parameters (n^2*p) grows quadratically in n, and OLS estimates become unreliable with standard macro sample sizes. The factor-augmented VAR (FAVAR) and the large Bayesian VAR address this -- Bernanke, Boivin, and Eliasz (2005) used a FAVAR to study monetary policy with 120 macro series. But the basic unrestricted VAR remains limited to small systems.

VAR-based Granger causality testing is standard in empirical macro. Does money growth predict output? Does the yield spread predict recessions? Does oil predict inflation? Each of these is a Granger causality question answered by an F-test on cross-equation lag coefficients in a VAR. The test is purely predictive -- it says nothing about deep causation -- but it disciplines which variables belong in a forecasting model.
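The F-test on cross-equation lag coefficients can be done by hand: compare the restricted regression (own lags only) with the unrestricted one (own lags plus the candidate's lags). A sketch on simulated data in which x genuinely leads y:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
T, p = 200, 2
x = rng.standard_normal(T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.4 * x[t - 1] + rng.standard_normal()

def ols_rss(X, z):
    b = np.linalg.lstsq(X, z, rcond=None)[0]
    r = z - X @ b
    return r @ r

Z = y[p:]
lags_y = np.column_stack([y[p - i:-i] for i in range(1, p + 1)])
lags_x = np.column_stack([x[p - i:-i] for i in range(1, p + 1)])
ones = np.ones((T - p, 1))
X_r = np.hstack([ones, lags_y])           # restricted: y's own lags only
X_u = np.hstack([ones, lags_y, lags_x])   # unrestricted: add lags of x

rss_r, rss_u = ols_rss(X_r, Z), ols_rss(X_u, Z)
q, df = p, T - p - X_u.shape[1]
F = ((rss_r - rss_u) / q) / (rss_u / df)
pval = stats.f.sf(F, q, df)
print(pval < 0.05)   # rejects here: x has real predictive content for y
```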

Literature and Extensions

Key Papers

  • Sims (1980) -- Macroeconomics and Reality (Econometrica). The founding paper. Argued for unrestricted VARs as an alternative to large structural models.
  • Litterman (1986) -- Forecasting with Bayesian Vector Autoregressions (Journal of Business and Economic Statistics). Introduced the Minnesota prior for VAR shrinkage.
  • Christiano, Eichenbaum, and Evans (1999) -- Monetary policy shocks: what have we learned and to what end? (Handbook of Macroeconomics). Canonical Cholesky-identified monetary policy VAR.
  • Stock and Watson (2001) -- Vector Autoregressions (Journal of Economic Perspectives). The standard survey article. Clear exposition of the methodology and its limits.
  • Kilian and Lutkepohl (2017) -- Structural Vector Autoregressive Analysis (Cambridge University Press). The modern reference on structural identification.

Named Variants

  • SVAR (Structural VAR) -- imposes short-run or long-run restrictions to identify structural shocks
  • BVAR (Bayesian VAR) -- applies shrinkage priors (Minnesota, Normal-Wishart) to regularize estimation in larger systems
  • FAVAR (Factor-Augmented VAR) -- replaces high-dimensional variable sets with a few estimated factors plus key policy variables
  • TVP-VAR (Time-Varying Parameter VAR) -- allows coefficients and volatility to drift over time, capturing regime changes
  • Proxy SVAR / External Instruments -- uses external variables correlated with the structural shock but uncorrelated with other shocks for identification

Open Questions

  • The identification problem remains the central challenge. No amount of data can resolve whether the contemporaneous correlation between interest rates and output reflects monetary policy affecting output or output affecting monetary policy. Every identification strategy (Cholesky, sign restrictions, external instruments) requires assumptions that are debatable.
  • Whether VARs should be estimated in levels, first differences, or with an error-correction term when variables are cointegrated is debated. Sims, Stock, and Watson (1990) argued that estimation in levels is valid even with unit roots as long as the researcher is not interested in testing hypotheses on individual coefficients.
  • The optimal amount of Bayesian shrinkage in BVARs is an active research area. The Minnesota prior was calibrated for U.S. quarterly data in the 1980s; its hyperparameters may not be optimal for other countries, frequencies, or time periods.

Components

y_t -- Endogenous vector

The n x 1 vector of observed variables at date t. All variables are treated symmetrically -- none is exogenous.

A_i -- Coefficient matrix at lag i

The n x n matrix of coefficients on the i-th lag of y_t. Entry (j,k) of A_i measures the marginal effect of y_{k,t-i} on y_{j,t}.

c -- Intercept vector

The n x 1 vector of constants. In a stationary VAR in levels, the intercepts pin down the unconditional mean via mu = (I - A_1 - ... - A_p)^{-1} c.

u_t -- Reduced-form innovation vector

The n x 1 vector of one-step prediction errors. E[u_t] = 0, E[u_t u_t'] = Sigma. Innovations can be contemporaneously correlated across equations.

Sigma -- Innovation covariance matrix

The n x n positive-definite matrix of contemporaneous covariances among the reduced-form innovations.

p -- Lag order

Number of lags of y_t included in each equation. Selected by AIC, BIC, or sequential LR tests.

Assumptions

Covariance stationarity (Testable)

All eigenvalues of the companion matrix (the stacked coefficient matrix in VAR(1) companion form) lie strictly inside the unit circle. The unconditional mean, variance, and autocovariance of y_t are time-invariant.

If violated: If the system has unit roots, OLS estimates are still consistent but t-statistics and F-tests have non-standard distributions. VAR in first differences or a VECM is needed.
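The stationarity condition is directly checkable: build the companion matrix and look at its eigenvalue moduli. A sketch with hypothetical VAR(2) coefficient matrices:

```python
import numpy as np

def companion(A_list):
    """Stack VAR(p) coefficient matrices A_1..A_p (each n x n) into the
    (n*p) x (n*p) companion matrix of the equivalent VAR(1)."""
    n, p = A_list[0].shape[0], len(A_list)
    C = np.zeros((n * p, n * p))
    C[:n, :] = np.hstack(A_list)          # top block row: [A_1 A_2 ... A_p]
    C[n:, :-n] = np.eye(n * (p - 1))      # identity blocks shift the lags down
    return C

# Hypothetical VAR(2) coefficients for illustration
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])
A2 = np.array([[0.2, 0.0], [0.1, 0.1]])
C = companion([A1, A2])
moduli = np.abs(np.linalg.eigvals(C))
print(np.all(moduli < 1))   # True: this system is covariance stationary
```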

Linearity (Testable)

The conditional mean of y_t is a linear function of its own lagged values. No threshold effects, regime switches, or nonlinear interactions.

If violated: If the true DGP is nonlinear (e.g., threshold VAR, Markov-switching VAR), the linear VAR captures only the linear projection and may miss asymmetric responses to large vs. small shocks.

No contemporaneous feedback in reduced form (Maintained)

The coefficient matrices A_1 through A_p act on lagged values only. y_t does not appear on the right-hand side. Contemporaneous interactions are absorbed into the covariance matrix Sigma.

If violated: Reduced-form VARs cannot identify instantaneous causal effects without additional structural assumptions. Misinterpreting reduced-form coefficients as causal is the most common mistake in applied VAR work.

Correct lag order (Testable)

The true DGP has finite autoregressive order p or can be well-approximated by a finite-order VAR(p). The selected p is not so small that it leaves serial correlation in residuals or so large that it wastes degrees of freedom.

If violated: Under-specified p leaves autocorrelation in residuals, biasing IRFs and GC tests. Over-specified p wastes degrees of freedom and inflates coefficient standard errors, reducing forecast accuracy.

Homoskedastic innovations (Testable)

E[u_t u_t' | y_{t-1}, y_{t-2}, ...] = Sigma for all t. The conditional covariance of the innovations does not depend on past values or past shocks.

If violated: Conditional heteroskedasticity (common in financial data, present in some macro series) does not affect OLS consistency but invalidates standard error estimates and prediction intervals. HAC standard errors or VAR-GARCH models are needed.

No structural breaks (Testable)

The coefficient matrices A_i and the covariance matrix Sigma are constant over the estimation window.

If violated: Structural breaks (e.g., the Great Moderation, changes in monetary policy regime) bias coefficient estimates toward a weighted average of the pre- and post-break regimes. Split-sample estimation or time-varying parameter VARs can help.