Data-Driven Models
Empirical forecasting models · Model guide
Autoregressive distributed lag model for short-run dynamics and long-run relationships among variables that may have mixed integration...
Is there a long-run equilibrium relationship between these variables when I am not sure whether they are I(0), I(1), or a mix?
Cointegration analysis traditionally requires that all variables be integrated of the same order. The Johansen (1991) procedure assumes a pure I(1) system, and the Engle-Granger (1987) two-step method makes the same assumption. In applied macro, this precondition is rarely met cleanly: unit-root tests have limited power, some series appear borderline I(0)/I(1), and pre-testing for integration order introduces its own specification risk. Pesaran, Shin, and Smith (2001) - hereafter PSS - proposed the bounds-testing approach: estimate an ARDL model in error-correction form and test the joint significance of the lagged-level terms. The test provides two sets of critical values - one assuming all regressors are I(0), one assuming all are I(1) - and the sample F-statistic is compared to both. If it exceeds the upper bound, the null of no levels relationship is rejected whether the regressors are I(0), I(1), or a mix. If it falls below the lower bound, the null cannot be rejected. If it falls between the bounds, the test is inconclusive.
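The three-way decision rule can be written down in a few lines. The sketch below is illustrative only: the function name bounds_decision is hypothetical, and the critical values are placeholders chosen for the example, not entries from the PSS tables. In practice the bounds depend on the number of regressors, the deterministic case, and the significance level.

```python
def bounds_decision(f_stat: float, lower: float, upper: float) -> str:
    """Classify a bounds F-statistic against the I(0) (lower) and I(1) (upper) bounds."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper:
        return "reject H0: evidence of a levels relationship"
    if f_stat < lower:
        return "fail to reject H0: no evidence of a levels relationship"
    return "inconclusive"


# Placeholder bounds for illustration only (NOT taken from the PSS tables).
LOWER, UPPER = 3.0, 4.0

for f in (5.2, 3.5, 1.8):
    print(f, "->", bounds_decision(f, LOWER, UPPER))
```

An inconclusive result is the case where knowing the integration orders of the regressors would matter, since the statistic lies between the two polar assumptions.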
The ARDL model itself is old - distributed-lag specifications were in use throughout the 1970s and 1980s (e.g., Hendry and von Ungern-Sternberg 1981). What PSS contributed was an inferential framework that works in the I(0)/I(1) gray zone. The approach reparameterizes the standard ARDL into a conditional error-correction form where the long-run coefficients appear explicitly as ratios of lagged-level coefficients and the short-run dynamics appear as differenced terms.
Estimation is straightforward OLS after selecting the lag orders (p, q_1, ..., q_k) by information criteria. The bounds test is an F-test on the joint significance of the lagged levels of y and the x_j in the error-correction equation. Pesaran and Shin (1999) showed that the ARDL approach produces consistent estimates of the long-run parameters even when the regressors are endogenous, provided the lag orders are correctly specified.
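To make the estimation step concrete, here is a minimal sketch using NumPy on simulated data. Everything about the data-generating process is an assumption made for the example: one regressor, true values PI_Y, BETA, and GAMMA chosen arbitrarily, and a single contemporaneous differenced term as the only short-run dynamic. The F-statistic compares the unrestricted error-correction regression with a restricted one that drops both lagged levels.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
PI_Y, BETA, GAMMA = -0.3, 2.0, 0.5  # hypothetical true values for the simulation

# x is a random walk; y error-corrects toward the long-run level BETA * x.
x = np.cumsum(rng.normal(size=T))
y = np.zeros(T)
for t in range(1, T):
    gap = y[t - 1] - BETA * x[t - 1]
    y[t] = y[t - 1] + PI_Y * gap + GAMMA * (x[t] - x[t - 1]) + rng.normal()

dy, dx = np.diff(y), np.diff(x)
ones = np.ones(T - 1)


def ols_rss(X, target):
    """Return OLS coefficients and the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    return coef, float(resid @ resid)


# Unrestricted: constant, y_{t-1}, x_{t-1}, and the differenced regressor.
X_u = np.column_stack([ones, y[:-1], x[:-1], dx])
# Restricted under H_0: both lagged-level terms are dropped.
X_r = np.column_stack([ones, dx])

coef_u, rss_u = ols_rss(X_u, dy)
_, rss_r = ols_rss(X_r, dy)

n_restrictions = 2
dof = len(dy) - X_u.shape[1]
f_stat = ((rss_r - rss_u) / n_restrictions) / (rss_u / dof)
pi_y_hat, pi_x_hat = coef_u[1], coef_u[2]

print("pi_y:", pi_y_hat, "pi_x:", pi_x_hat, "bounds F:", f_stat)
```

The statistic is computed like an ordinary F-statistic, but under the null its distribution is nonstandard, so it should be compared with bounds critical values rather than with a standard F table.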
The ARDL bounds test is the workhorse cointegration test in applied development economics, energy economics, and agricultural economics. Google Scholar returns over 80,000 citations to PSS (2001), making it one of the most cited econometrics papers ever. It is implemented in EViews, Stata's ardl command, and the R package dynamac.
The model takes a single dependent variable y_t and k regressors x_1t, ..., x_kt. Each regressor enters with its own lag order q_j, selected by AIC, BIC, or similar criteria. The model includes p lags of y_t and up to q_j lags of each x_jt. Deterministic components (intercept, restricted trend, unrestricted trend) are included depending on the case (PSS tabulated five cases).
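A sketch of how the lag structure and order selection might look in code, for the single-regressor case. The helper names ardl_design, bic, and select_order are hypothetical, the simulated series are placeholders, and the BIC formula used is the common Gaussian form n*log(RSS/n) + k*log(n). All candidates are fitted on the same estimation sample so that their criteria are comparable.

```python
import itertools

import numpy as np


def ardl_design(y, x, p, q, start):
    """Rows t = start..T-1; columns: const, y_{t-1..t-p}, x_t, x_{t-1..t-q}."""
    T = len(y)
    cols = [np.ones(T - start)]
    cols += [y[start - i:T - i] for i in range(1, p + 1)]
    cols += [x[start - j:T - j] for j in range(0, q + 1)]
    return np.column_stack(cols), y[start:]


def bic(y, x, p, q, start):
    """Gaussian BIC of an ARDL(p, q) fitted by OLS."""
    X, target = ardl_design(y, x, p, q, start)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    n, k = X.shape
    return n * np.log(resid @ resid / n) + k * np.log(n)


def select_order(y, x, max_p=4, max_q=4):
    """Grid search over (p, q) on a common sample; returns the BIC-minimizing pair."""
    start = max(max_p, max_q)
    grid = itertools.product(range(1, max_p + 1), range(0, max_q + 1))
    return min(grid, key=lambda pq: bic(y, x, pq[0], pq[1], start))


# Simulated ARDL(1, 1) data, for illustration only.
rng = np.random.default_rng(1)
T = 200
x = np.cumsum(rng.normal(size=T))
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 1.0 * x[t] + 0.5 * x[t - 1] + rng.normal()

best_p, best_q = select_order(y, x)
print("selected orders:", best_p, best_q)
```

With k regressors the grid grows multiplicatively in the q_j, which is one reason applied work usually caps the maximum lag at a small number.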
The error-correction reparameterization rewrites the ARDL as: Delta y_t = alpha_0 + pi_y * y_{t-1} + sum_j(pi_j * x_{j,t-1}) + short-run differenced terms + epsilon_t. The coefficient pi_y is the speed of adjustment: it must be negative for the ECM to be stable. The long-run coefficients are recovered as beta_j = -pi_j / pi_y. The bounds test is an F-test for H_0: pi_y = pi_1 = ... = pi_k = 0 (no levels relationship).
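The mapping from lagged-level coefficients to long-run coefficients is a simple ratio. The sketch below applies beta_j = -pi_j / pi_y to made-up estimates; the function name and the variable names "income" and "price" are hypothetical.

```python
def long_run_coefficients(pi_y: float, pi_x: dict) -> dict:
    """Apply beta_j = -pi_j / pi_y to each lagged-level coefficient."""
    if pi_y >= 0:
        raise ValueError("pi_y must be negative for a stable error-correction model")
    return {name: -pi_j / pi_y for name, pi_j in pi_x.items()}


# Made-up ECM estimates, for illustration only.
estimates = long_run_coefficients(-0.25, {"income": 0.10, "price": -0.05})
print(estimates)
```

Because pi_y sits in the denominator, a speed of adjustment close to zero makes the long-run coefficients large and imprecise, which is worth checking before interpreting them.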
Output includes: (1) the estimated long-run coefficients beta_j with standard errors (obtained via the delta method or the Bewley transform), (2) the short-run coefficients on the differenced terms, (3) the error-correction speed pi_y, (4) the bounds F-statistic and its position relative to the I(0) and I(1) critical value bounds, and (5) diagnostic tests (serial correlation, heteroskedasticity, functional form, normality).
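For the delta-method standard error, the gradient of beta_j = -pi_j / pi_y with respect to (pi_j, pi_y) is (-1 / pi_y, pi_j / pi_y^2). The sketch below plugs that gradient into the usual variance formula. The function name long_run_se and all numerical inputs are hypothetical; in practice the variances and covariance come from the OLS coefficient covariance matrix.

```python
import math


def long_run_se(pi_j, pi_y, var_j, var_y, cov_jy):
    """Delta-method standard error of beta_j = -pi_j / pi_y."""
    g_j = -1.0 / pi_y          # d beta_j / d pi_j
    g_y = pi_j / pi_y ** 2     # d beta_j / d pi_y
    variance = g_j ** 2 * var_j + g_y ** 2 * var_y + 2.0 * g_j * g_y * cov_jy
    return math.sqrt(variance)


# Made-up inputs, for illustration only.
se = long_run_se(pi_j=0.10, pi_y=-0.25, var_j=0.0004, var_y=0.0025, cov_jy=0.0005)
print("long-run coefficient:", -0.10 / -0.25, "delta-method SE:", se)
```

The covariance term is generally nonzero because y_{t-1} and x_{j,t-1} move together, so dropping it can noticeably misstate the standard error.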
The World Bank and IMF use ARDL bounds testing to study long-run determinants of economic growth, financial development, and energy consumption in developing countries where time series are short (T = 30-60) and integration orders are uncertain. The single-equation framework is attractive because it avoids the Johansen procedure's requirement for a well-specified VAR with sufficient degrees of freedom.
Energy economists use the ARDL to test the Environmental Kuznets Curve (EKC) hypothesis: whether pollution first rises and then falls with per-capita income. For this inverted-U shape, the long-run coefficient on income must be positive and the coefficient on income-squared must be negative and significant. The bounds test indicates whether a genuine long-run relationship exists or whether the quadratic pattern is spurious.
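If the long-run relationship is quadratic in income, the implied turning point follows from setting the derivative to zero: income* = -beta_1 / (2 * beta_2). The sketch below assumes a specification with income and income-squared as the only income terms; the function name ekc_turning_point and the coefficient values are hypothetical. When income enters in logs, the turning point is converted back to levels with exp.

```python
import math


def ekc_turning_point(beta_income, beta_income_sq, in_logs=True):
    """Turning point of beta_1 * inc + beta_2 * inc^2, or None if not an inverted U."""
    if not (beta_income > 0 and beta_income_sq < 0):
        return None  # sign pattern is inconsistent with the EKC hypothesis
    peak = -beta_income / (2.0 * beta_income_sq)
    return math.exp(peak) if in_logs else peak


# Made-up long-run coefficients, for illustration only.
print(ekc_turning_point(4.0, -0.25, in_logs=False))  # turning point in the units of the regressor
print(ekc_turning_point(4.0, -0.25, in_logs=True))   # same point converted from logs to levels
print(ekc_turning_point(1.0, 0.5))                   # wrong signs, so no turning point
```

A useful sanity check is whether the implied turning point falls inside the observed income range; a turning point far outside the sample says little about the data at hand.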
The ARDL cannot handle more than one cointegrating vector in a single equation. When the system has multiple long-run relationships (e.g., a money-demand equation and a Phillips curve that share variables), the Johansen VECM is needed. The ARDL also requires that no variable is I(2), which rules out nominal price series in hyperinflationary episodes. With very short samples (T < 30), the bounds F-statistic has low power and the critical values from PSS (2001) may not apply - Narayan (2005) tabulated small-sample critical values.
Extensions include the nonlinear ARDL (NARDL) of Shin, Yu, and Greenwood-Nimmo (2014), which decomposes regressors into positive and negative partial sums to test for asymmetric long-run and short-run effects. The quantile ARDL (QARDL) extends the framework to quantile regression. Panel ARDL estimators (Pesaran, Shin, and Smith 1999) handle cross-country heterogeneity through mean-group and pooled mean-group estimators.
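The partial-sum decomposition used by the NARDL is easy to illustrate. The sketch below splits a series into cumulative positive and cumulative negative changes, so that x_t = x_0 + x_pos_t + x_neg_t. The function name partial_sums and the sample series are hypothetical.

```python
def partial_sums(x):
    """Return cumulative sums of positive and of negative first differences of x."""
    pos, neg = [0.0], [0.0]
    for prev, cur in zip(x, x[1:]):
        dx = cur - prev
        pos.append(pos[-1] + max(dx, 0.0))
        neg.append(neg[-1] + min(dx, 0.0))
    return pos, neg


# Made-up series, for illustration only.
x = [10.0, 12.0, 11.0, 11.5, 9.0]
x_pos, x_neg = partial_sums(x)
print(x_pos)
print(x_neg)
```

The two partial-sum series then enter the ARDL as separate regressors, and asymmetry is assessed by testing whether their coefficients are equal.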
Coefficient on the lagged level of y (that is, on y_{t-1}). Must be negative for stability. Its absolute value measures how fast disequilibrium is corrected per period.
Coefficients on the lagged levels x_{j,t-1} in the error-correction form. The long-run multiplier for x_j is beta_j = -pi_j / pi_y.
The equilibrium effect of a permanent one-unit change in x_j on y. Derived from the ratio of lagged-level coefficients.
Differenced terms capturing the transient response of y to changes in y and x. Lag orders selected by information criteria.
Joint F-test for H_0: pi_y = pi_1 = ... = pi_k = 0. Compared to lower (I(0)) and upper (I(1)) critical value bounds.
Deviation from the estimated long-run equilibrium. When ECT_t > 0, y is above equilibrium and pi_y pulls it back down.
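The speed of adjustment is often summarized as a half-life. Under the simplifying assumption that there are no short-run differenced terms and no further shocks, a gap of size gap_0 shrinks to gap_0 * (1 + pi_y)^t after t periods, so the half-life is ln(0.5) / ln(1 + pi_y). The function names below are hypothetical and the value of pi_y is made up.

```python
import math


def half_life(pi_y):
    """Periods needed to close half of a disequilibrium, assuming -1 < pi_y < 0."""
    if not -1.0 < pi_y < 0.0:
        raise ValueError("half-life is defined here only for -1 < pi_y < 0")
    return math.log(0.5) / math.log(1.0 + pi_y)


def gap_path(gap0, pi_y, periods):
    """Path of the error-correction term when only the adjustment term is active."""
    path = [gap0]
    for _ in range(periods):
        path.append(path[-1] * (1.0 + pi_y))
    return path


# Made-up speed of adjustment: 25% of the gap is closed each period.
print(half_life(-0.25))
print(gap_path(1.0, -0.25, 3))
```

With richer short-run dynamics the actual adjustment path can differ from this geometric decay, so the half-life is best read as a rough summary.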
All variables are I(0) or I(1). The bounds test is invalid for I(2) processes.
If violated: I(2) variables make the asymptotic theory invalid. The bounds critical values no longer bracket the true distribution.
The ARDL lag orders (p, q_1, ..., q_k) are chosen to eliminate serial correlation in the residuals.
If violated: Under-specified lags produce serially correlated residuals, biasing the bounds F-statistic upward and inflating the rate at which the null of no levels relationship is falsely rejected.
There is at most one long-run relationship with y_t as the dependent variable.
If violated: If multiple cointegrating vectors exist, the ARDL identifies only the one normalized on y. The others are ignored, potentially biasing the estimated long-run coefficients.
The regressors x_jt do not respond to the error-correction term in the long run (or the lag orders are sufficient to proxy any endogeneity).
If violated: If x_jt is strongly endogenous and lag orders are insufficient, the OLS estimates are inconsistent. Pesaran and Shin (1999) showed that sufficient lags restore consistency.
The long-run and short-run parameters are constant over the sample.
If violated: Structural breaks can create spurious cointegration results. CUSUM and CUSUMSQ tests are standard post-estimation diagnostics.
epsilon_t is i.i.d. with zero mean and finite variance.
If violated: Heteroskedasticity does not bias the point estimates but invalidates the standard bounds critical values. Use Kripfganz-Schneider (2020) corrected critical values.
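Several of the assumptions above are checked on the residuals. As a rough screen for leftover serial correlation, the sketch below computes the Ljung-Box Q statistic, Q = n(n+2) * sum_k r_k^2 / (n - k), in plain Python. This is a swapped-in simple check, not the Breusch-Godfrey LM test that is usually reported for models with lagged dependent variables, and the function names and sample residuals are hypothetical.

```python
def autocorr(resid, k):
    """Sample autocorrelation of the residuals at lag k."""
    n = len(resid)
    mean = sum(resid) / n
    dev = [r - mean for r in resid]
    denom = sum(d * d for d in dev)
    num = sum(dev[t] * dev[t - k] for t in range(k, n))
    return num / denom


def ljung_box_q(resid, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(resid)
    return n * (n + 2) * sum(
        autocorr(resid, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Made-up residuals with strong negative first-order autocorrelation.
alternating = [1.0, -1.0] * 20
print(autocorr(alternating, 1))
print(ljung_box_q(alternating, 1))
```

A large Q relative to a chi-square reference value (about 3.84 at the 5% level with one degree of freedom) suggests adding lags and re-estimating before trusting the bounds F-statistic.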