Prediction Terminology

Terminology is often confusing and highly variable among those who make predictions in the geoscience community. Here we define some common terms in climate prediction and explain how we use them in climpred.

Simulation Design

Hindcast Ensemble: m ensemble members are initialized from a reference simulation (generally a reconstruction from reanalysis) at n initialization dates and integrated for l lead years [Boer2016] (HindcastEnsemble).
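A hindcast ensemble is therefore naturally indexed by initialization date, lead time, and member. As a minimal sketch of that layout (a NumPy stand-in with hypothetical sizes; climpred itself operates on xarray objects carrying these dimensions):

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy hindcast: n=20 initialization dates, l=5 lead years, m=10 members.
# climpred consumes xarray objects with these dimensions rather than bare arrays.
n_init, n_lead, n_member = 20, 5, 10
hindcast = rng.standard_normal((n_init, n_lead, n_member))

# The ensemble-mean forecast collapses the member dimension,
# leaving one forecast per (initialization, lead) pair.
ensemble_mean = hindcast.mean(axis=-1)
```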

Perfect Model Experiment: m ensemble members are initialized from a control simulation at n randomly chosen initialization dates and integrated for l lead years [Griffies1997] (PerfectModelEnsemble).

Reconstruction/Assimilation: A “reconstruction” is a model solution that uses observations in some capacity to approximate historical conditions. This could be done via a forced simulation, such as an OMIP run that uses a dynamical ocean/sea ice core with reanalysis forcing from atmospheric winds. This could also be a fully data assimilative model, which assimilates observations into the model solution.

Uninitialized Ensemble: In this framework, an uninitialized ensemble is one generated by perturbing initial conditions at only one point in the historical run. These are generated via micro (round-off error perturbations) or macro (starting from completely different restart files) methods. Uninitialized ensembles are used to approximate the magnitude of internal climate variability and to confidently extract the forced response (ensemble mean) in the climate system. In climpred, we use uninitialized ensembles as a baseline for how important (recurring) initializations are for lending predictability to the system. Some modeling centers (such as NCAR) provide a dynamical uninitialized ensemble (the CESM Large Ensemble) along with their initialized prediction system (the CESM Decadal Prediction Large Ensemble). If this isn’t available, one can approximate the uninitialized response by bootstrapping a control simulation.
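One way to build such a bootstrapped stand-in is to slice pseudo-members out of a long control run at random start years. A minimal sketch (toy data and hypothetical sizes, not climpred's own bootstrapping routine):

```python
import numpy as np

rng = np.random.default_rng(0)
control = rng.standard_normal(300)  # toy 300-year annual control time series

n_members, n_leads = 10, 5
# Each pseudo-member is a random contiguous segment of the control run,
# mimicking an uninitialized member drawn from internal variability alone.
starts = rng.integers(0, control.size - n_leads, size=n_members)
uninitialized = np.stack([control[s : s + n_leads] for s in starts])
```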

Forecast Assessment

Accuracy: The average degree of correspondence between individual pairs of forecasts and observations [Murphy1988]; [Jolliffe2011]. Examples include Mean Absolute Error (MAE) and Mean Square Error (MSE). See metrics.
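With toy forecast/observation pairs, these two accuracy measures reduce to a few lines:

```python
import numpy as np

# Hypothetical forecast/observation pairs for illustration only
forecast = np.array([1.1, 0.8, -0.2, 0.5])
obs = np.array([1.0, 1.0, 0.0, 0.3])

mae = np.mean(np.abs(forecast - obs))  # Mean Absolute Error
mse = np.mean((forecast - obs) ** 2)   # Mean Square Error
```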

Association: The overall strength of the relationship between individual pairs of forecasts and observations [Jolliffe2011]. The primary measure of association is the Anomaly Correlation Coefficient (ACC), which can be measured using the Pearson product-moment correlation or Spearman’s Rank correlation. See metrics.
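As a sketch, the Pearson flavor of the ACC is simply the product-moment correlation between forecast and observed anomalies (toy data for illustration):

```python
import numpy as np

# Hypothetical forecast and observed anomalies (deviations from climatology)
forecast_anom = np.array([0.3, -0.1, 0.4, -0.5, 0.2])
obs_anom = np.array([0.2, -0.2, 0.5, -0.4, 0.1])

# Anomaly Correlation Coefficient via Pearson product-moment correlation
acc = np.corrcoef(forecast_anom, obs_anom)[0, 1]
```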

(Potential) Predictability: This characterizes the “ability to be predicted” rather than the current “ability to predict.” It is assessed by computing a metric (such as the anomaly correlation coefficient (ACC)) between the prediction ensemble and a verification member (in a perfect-model setup) or the reconstruction that initialized it (in a hindcast setup) [Meehl2013].

(Prediction) Skill: Skill assesses the ability of the forecasting system to predict the real world, i.e., observations. It must be compared to some “standard of reference,” such as climatology or persistence, to truly be considered skill [Murphy1988].

Skill Score: The most generic skill score can be defined as the following [Murphy1988]:

S = \frac{A_{f} - A_{r}}{A_{p} - A_{r}},

where A_{f}, A_{p}, and A_{r} represent the accuracy of the forecast being assessed, the accuracy of a perfect forecast, and the accuracy of the reference forecast (e.g. persistence), respectively [Murphy1985]. Here, S represents the improvement in accuracy of the forecasts over the reference forecasts relative to the total possible improvement in accuracy. Skill scores are typically designed to take a value of 1 for a perfect forecast and a value of 0 for a forecast equivalent to the reference forecast [Jolliffe2011].
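As a sketch, for an error-based accuracy measure such as MSE a perfect forecast has A_{p} = 0, so the score reduces to 1 - A_{f}/A_{r}:

```python
def skill_score(a_f, a_r, a_p=0.0):
    """Generic skill score S = (A_f - A_r) / (A_p - A_r).

    a_f: accuracy of the forecast being assessed
    a_r: accuracy of the reference forecast (e.g. persistence)
    a_p: accuracy of a perfect forecast (0 for error-based
         measures such as MSE or MAE)
    """
    return (a_f - a_r) / (a_p - a_r)

# Forecast MSE half that of persistence -> halfway to a perfect forecast
print(skill_score(a_f=0.5, a_r=1.0))  # 0.5
```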


Hindcast: Retrospective forecasts of the past initialized from a reconstruction integrated under external forcing [Boer2016].

Prediction: Forecasts initialized from a reconstruction integrated into the future with external forcing [Boer2016].

Projection: An estimate of the future climate that is dependent on the externally forced climate response, such as anthropogenic greenhouse gases, aerosols, and volcanic eruptions [Meehl2013].


[Griffies1997] Griffies, S. M., and K. Bryan. “A Predictability Study of Simulated North Atlantic Multidecadal Variability.” Climate Dynamics 13, no. 7–8 (August 1, 1997): 459–87.
[Boer2016] Boer, G. J., Smith, D. M., Cassou, C., Doblas-Reyes, F., Danabasoglu, G., Kirtman, B., Kushnir, Y., Kimoto, M., Meehl, G. A., Msadek, R., Mueller, W. A., Taylor, K. E., Zwiers, F., Rixen, M., Ruprich-Robert, Y., and Eade, R.: The Decadal Climate Prediction Project (DCPP) contribution to CMIP6, Geosci. Model Dev., 9, 3751-3777, 2016.
[Jolliffe2011] Ian T. Jolliffe and David B. Stephenson. Forecast Verification: A Practitioner’s Guide in Atmospheric Science. John Wiley & Sons, Ltd, Chichester, UK, December 2011. ISBN 978-1-119-96000-3, 978-0-470-66071-3.
[Meehl2013] Meehl, G. A., Goddard, L., Boer, G., Burgman, R., Branstator, G., Cassou, C., … & Karspeck, A. (2014). Decadal climate prediction: an update from the trenches. Bulletin of the American Meteorological Society, 95(2), 243-267.
[Murphy1985] Murphy, Allan H., and Daan, H. “Forecast evaluation.” Probability, Statistics, and Decision Making in the Atmospheric Sciences, A. H. Murphy and R. W. Katz, Eds., Westview Press, 379-437.
[Murphy1988] Murphy, Allan H. “Skill Scores Based on the Mean Square Error and Their Relationships to the Correlation Coefficient.” Monthly Weather Review 116, no. 12 (December 1, 1988): 2417–24.