Tournament Comparison

Forecast tournament method notes for Macro by Mark Labs.

The tournament compares forecasts on a shared evaluation sample. It is a reporting layer for aligned predictions and actuals. It does not re-estimate the source models.

Alignment

Every forecast must be scored against the same actuals. Forecasts whose target series, horizon, transform, vintage mode, or evaluation window is incompatible with the rest should be excluded before ranking.
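
The exclusion rule can be sketched as follows. This is a minimal illustration, not the lab's implementation: the names ForecastSpec and align_forecasts, the dict-based input layout, and the choice to treat the dates of the actuals as the evaluation window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ForecastSpec:
    # Hypothetical metadata record; the field names are assumptions.
    target: str
    horizon: int
    transform: str
    vintage_mode: str


def align_forecasts(reference_spec, actuals, submissions):
    """Keep only forecasts that match the reference spec and cover the window.

    actuals:     dict mapping date -> realized value (defines the window)
    submissions: dict mapping model name -> (ForecastSpec, dict date -> forecast)
    """
    window = sorted(actuals)
    aligned, excluded = {}, []
    for name, (spec, preds) in submissions.items():
        covers_window = all(date in preds for date in window)
        if spec != reference_spec or not covers_window:
            excluded.append(name)  # incompatible: never ranked
        else:
            aligned[name] = [preds[date] for date in window]
    return window, [actuals[date] for date in window], aligned, excluded
```

Returning the excluded names alongside the aligned series keeps the exclusion visible in the report instead of silently shrinking the field.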

Scorecard

The scorecard reports sample size, RMSE, MAE, mean loss, rank, and loss relative to the best model. The selected loss function can be squared error, absolute error, LINEX, or tick loss.
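
A rough sketch of how such a scorecard could be assembled is shown here. The function names, the LINEX parameter a, the tick-loss quantile tau, the error convention (actual minus forecast), and the definition of relative loss as a ratio to the best mean loss are assumptions for illustration; the notes do not specify them.

```python
import math


def squared_error(e):
    return e * e


def absolute_error(e):
    return abs(e)


def linex(e, a=1.0):
    # LINEX loss: asymmetric, exponential on one side and linear on the other.
    return math.exp(a * e) - a * e - 1.0


def tick(e, tau=0.5):
    # Tick (quantile) loss with e = actual - forecast.
    return e * (tau - (1.0 if e < 0 else 0.0))


def scorecard(actuals, forecasts, loss=squared_error):
    """Build one row per model, ranked by mean selected loss (lowest first)."""
    rows = []
    for name, preds in forecasts.items():
        errors = [y - f for y, f in zip(actuals, preds)]
        n = len(errors)
        rows.append({
            "model": name,
            "n": n,
            "rmse": math.sqrt(sum(e * e for e in errors) / n),
            "mae": sum(abs(e) for e in errors) / n,
            "mean_loss": sum(loss(e) for e in errors) / n,
        })
    rows.sort(key=lambda row: row["mean_loss"])
    best = rows[0]["mean_loss"]
    for rank, row in enumerate(rows, start=1):
        row["rank"] = rank
        # Assumed definition: ratio to the best model's mean loss.
        row["relative_loss"] = row["mean_loss"] / best if best > 0 else float("nan")
    return rows
```

RMSE and MAE are always reported, while rank and relative loss follow the selected loss, so a model can rank first under tick loss without having the lowest RMSE.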

Pairwise Tests

The lab reports pairwise Diebold-Mariano tests with the Harvey-Leybourne-Newbold small-sample adjustment. For nested models compared on MSPE it uses Clark-West instead, which adjusts for the estimation noise that inflates the larger model's MSPE. Nested comparisons under any other loss fall back to Diebold-Mariano on the selected loss.
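
The two statistics can be sketched as below. This is an illustrative version under stated assumptions, not the lab's code: the function names are hypothetical, the Diebold-Mariano long-run variance uses a rectangular kernel with h - 1 autocovariances, and the Clark-West statistic is written for one-step-ahead forecasts only, with no HAC correction. Only the statistics are returned; the adjusted Diebold-Mariano statistic would be compared with a Student-t distribution with n - 1 degrees of freedom, and Clark-West with one-sided standard normal critical values.

```python
import math


def dm_hln_statistic(loss_a, loss_b, h=1):
    """Diebold-Mariano statistic with the Harvey-Leybourne-Newbold correction.

    loss_a, loss_b: per-period losses of two models on the same sample.
    h: forecast horizon. Positive values mean model A has the higher loss.
    """
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    mean_d = sum(d) / n

    def autocov(k):
        return sum((d[t] - mean_d) * (d[t - k] - mean_d) for t in range(k, n)) / n

    # Rectangular-kernel long-run variance (an assumption of this sketch).
    long_run_var = autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))
    if long_run_var <= 0:
        raise ValueError("non-positive long-run variance estimate")
    dm = mean_d / math.sqrt(long_run_var / n)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return dm * correction


def clark_west_statistic(actuals, nested_forecasts, larger_forecasts):
    """Clark-West MSPE-adjusted statistic for one-step-ahead nested forecasts."""
    f = []
    for y, small, large in zip(actuals, nested_forecasts, larger_forecasts):
        adjustment = (small - large) ** 2  # removes estimation noise from the larger model
        f.append((y - small) ** 2 - ((y - large) ** 2 - adjustment))
    n = len(f)
    mean_f = sum(f) / n
    var_f = sum((x - mean_f) ** 2 for x in f) / (n - 1)
    return mean_f / math.sqrt(var_f / n)
```

For horizons above one, the Clark-West variance would need an autocorrelation-robust estimate; that step is omitted here to keep the sketch short.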

Giacomini-White conditional predictive ability is not reported in this release because it requires conditioning state variables.

Model Confidence Set

The model confidence set step removes models whose loss is statistically worse under the selected rule and bootstrap configuration. Inclusion means the model survived the selected procedure; it does not mean the model is true.
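
One way the elimination loop could look is sketched here. The notes do not state which rule or bootstrap the lab uses, so this example assumes the T_max statistic with a moving-block bootstrap; the function name, the default alpha, block length, and replication count are all hypothetical. It also assumes the sample has at least as many observations as the block length.

```python
import math
import random


def model_confidence_set(losses, alpha=0.10, n_boot=500, block=2, seed=0):
    """Simplified model confidence set using T_max and a moving-block bootstrap.

    losses: dict mapping model name -> list of per-period losses (equal length).
    Returns (surviving model names, list of (eliminated name, p-value)).
    """
    rng = random.Random(seed)
    n = len(next(iter(losses.values())))
    n_blocks = math.ceil(n / block)
    # Draw the bootstrap index sets once and reuse them in every round.
    draws = []
    for _ in range(n_boot):
        idx = []
        for _ in range(n_blocks):
            start = rng.randrange(n - block + 1)
            idx.extend(range(start, start + block))
        draws.append(idx[:n])

    surviving, eliminated = list(losses), []
    while len(surviving) > 1:
        means = {m: sum(losses[m]) / n for m in surviving}
        grand = sum(means.values()) / len(surviving)
        dbar = {m: means[m] - grand for m in surviving}  # loss relative to set average
        boot_dev = {m: [] for m in surviving}
        for idx in draws:
            b_means = {m: sum(losses[m][t] for t in idx) / n for m in surviving}
            b_grand = sum(b_means.values()) / len(surviving)
            for m in surviving:
                boot_dev[m].append(b_means[m] - b_grand - dbar[m])
        se = {m: math.sqrt(sum(x * x for x in boot_dev[m]) / n_boot) for m in surviving}
        if any(s == 0 for s in se.values()):
            raise ValueError("degenerate loss differentials")
        t_stat = {m: dbar[m] / se[m] for m in surviving}
        t_max = max(t_stat.values())
        boot_max = [max(boot_dev[m][b] / se[m] for m in surviving) for b in range(n_boot)]
        p_value = sum(1 for x in boot_max if x >= t_max) / n_boot
        if p_value >= alpha:
            break  # equal predictive ability not rejected: stop eliminating
        worst = max(surviving, key=lambda m: t_stat[m])
        surviving.remove(worst)
        eliminated.append((worst, p_value))
    return surviving, eliminated
```

The returned set depends on alpha, the block length, and the seed, which is why inclusion should be read as survival under one configuration and nothing stronger.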

Limits

  • Small holdout windows can make pairwise p-values unstable.
  • A model can win a point-loss tournament and still have poor density calibration.
  • Tournaments should be run on the terminal-test window after weights are tuned on validation data.

References

  • Diebold and Mariano, 1995.
  • Harvey, Leybourne, and Newbold, 1997.
  • Clark and West, 2007.
  • Hansen, Lunde, and Nason, 2011.