Prediction Terminology

Prediction terminology is often confusing and varies widely among those who make predictions in the geoscience community. Here we define some common terms in climate prediction and explain how we use them in climpred.

Simulation Design

Hindcast Ensemble: Ensemble members are initialized from a simulation (generally a reconstruction from reanalysis) or an analysis (representing the current state of the atmosphere, land, and ocean by assimilation of observations) at initialization dates and integrated for some lead years [Boer2016] (HindcastEnsemble).

Perfect Model Experiment: Ensemble members are initialized from a control simulation at randomly chosen initialization dates and integrated for some lead years [Griffies1997] (PerfectModelEnsemble).

Reconstruction/Assimilation: A “reconstruction” is a model solution that uses observations in some capacity to approximate historical or current conditions of the atmosphere, ocean, sea ice, and/or land. This could be done via a forced simulation, such as an OMIP run that uses a dynamical ocean/sea ice core with reanalysis forcing from atmospheric winds. This could also be a fully data assimilative model, which assimilates observations into the model solution. For weather, subseasonal, and seasonal predictions, the terms reanalysis and analysis are typically used, while reconstruction is more common for decadal predictions.

Uninitialized Ensemble: In this framework, an uninitialized ensemble is one that is generated by perturbing initial conditions at only one point in the historical run. These are generated via micro (round-off error perturbations) or macro (starting from completely different restart files) methods. Uninitialized ensembles are used to approximate the magnitude of internal climate variability and to confidently extract the forced response (ensemble mean) in the climate system. In climpred, we use uninitialized ensembles as a baseline for how important (recurring) initializations are for lending predictability to the system. Some modeling centers (such as NCAR) provide a dynamical uninitialized ensemble (the CESM Large Ensemble) along with their initialized prediction system (the CESM Decadal Prediction Large Ensemble). If this isn’t available, one can approximate the uninitialized response by bootstrapping a control simulation.
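The idea of approximating an uninitialized ensemble by bootstrapping a control simulation can be sketched as follows. This is a minimal illustration in plain Python, not climpred's implementation: the function name `bootstrap_uninitialized`, its parameters, and the synthetic control series are all assumptions made for this example.

```python
import math
import random


def bootstrap_uninitialized(control, n_members, n_leads, seed=0):
    """Build a pseudo-ensemble by cutting random segments out of a control run.

    control   : list of annual values from a control simulation
    n_members : number of pseudo-members to draw
    n_leads   : length of each segment (number of lead years)
    """
    last_start = len(control) - n_leads
    if last_start < 0:
        raise ValueError("control run is shorter than the requested segment")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    members = []
    for _ in range(n_members):
        start = rng.randint(0, last_start)  # inclusive on both ends
        members.append(control[start:start + n_leads])
    return members


# Synthetic stand-in for a 200-year control simulation (not real model output).
control = [math.sin(0.3 * year) for year in range(200)]
members = bootstrap_uninitialized(control, n_members=10, n_leads=5)

# Ensemble mean at each lead year. Because the start dates are random,
# the mean carries no information from any particular initialization.
ensemble_mean = [sum(m[lead] for m in members) / len(members) for lead in range(5)]
```

Because the segments start at random dates, any skill they show against a verification series reflects chance and the model's climatology rather than initialization, which is what makes them usable as a baseline.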

Forecast Assessment

Accuracy: The average degree of correspondence between individual pairs of forecasts and observations [Murphy1988] [Jolliffe2011]. Examples include Mean Absolute Error (MAE) and Mean Square Error (MSE). See metrics.
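As a small worked example of the two accuracy measures named above, the following sketch computes MAE and MSE for paired forecasts and observations. The function names and sample values are illustrative assumptions, not part of climpred.

```python
def _check_pairs(forecasts, observations):
    """Accuracy measures need one observation per forecast."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("need non-empty series of equal length")


def mean_absolute_error(forecasts, observations):
    """Average magnitude of the forecast errors."""
    _check_pairs(forecasts, observations)
    errors = [abs(f - o) for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def mean_square_error(forecasts, observations):
    """Average squared forecast error; penalizes large misses more strongly."""
    _check_pairs(forecasts, observations)
    errors = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


# Made-up forecast/observation pairs for illustration.
forecasts = [1.0, 2.0, 4.0, 3.0]
observations = [1.5, 2.0, 3.0, 5.0]

mae = mean_absolute_error(forecasts, observations)  # 0.875
mse = mean_square_error(forecasts, observations)    # 1.3125
```

Both measures are 0 for a perfect forecast and grow as the forecast degrades, so lower values indicate higher accuracy.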

Association: The overall strength of the relationship between individual pairs of forecasts and observations [Jolliffe2011]. The primary measure of association is the Anomaly Correlation Coefficient (ACC), which can be measured using the Pearson product-moment correlation or Spearman’s rank correlation. See metrics.
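To make the difference between the two correlation choices concrete, here is a minimal sketch that computes both on series assumed to already be anomalies (climatology removed). The helper names and sample values are assumptions for illustration only.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


# Made-up anomalies; the forecast contains one large outlier.
forecast_anom = [-1.0, -0.5, 0.0, 0.5, 8.0]
observed_anom = [-0.8, -0.6, 0.1, 0.4, 1.0]

acc_pearson = pearson_r(forecast_anom, observed_anom)
acc_spearman = spearman_r(forecast_anom, observed_anom)
```

In this example the two series are ordered identically, so the rank correlation is 1, while the Pearson value is lower because the outlier breaks the linear relationship.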

(Potential) Predictability: This characterizes the “ability to be predicted” rather than the current “capability to predict.” One estimates this by computing a metric, such as the anomaly correlation coefficient (ACC), between the prediction ensemble and either a member (or collection of members) selected as the verification member(s) in a perfect-model setup, or the reconstruction that initialized the ensemble in a hindcast setup [Meehl2013] [Pegion2019].
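In a perfect-model setup, one way to picture this is to treat each member in turn as the verification and correlate it against the mean of the remaining members. The sketch below does this at a single lead time with made-up anomalies; the data layout and the function name `member_vs_ensemble_mean` are assumptions for this example and do not reproduce climpred's API.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def member_vs_ensemble_mean(ensemble, verif_index):
    """Correlate one verification member with the mean of the other members.

    ensemble[i][m] is the anomaly for initialization i and member m
    at one fixed lead time; the correlation is taken across initializations.
    """
    verification = [members[verif_index] for members in ensemble]
    forecast = []
    for members in ensemble:
        others = [v for m, v in enumerate(members) if m != verif_index]
        forecast.append(sum(others) / len(others))
    return pearson_r(forecast, verification)


# Four initializations, three members each (illustrative values only).
ensemble = [
    [0.9, 1.1, 1.0],
    [-0.4, -0.6, -0.5],
    [0.2, 0.1, 0.3],
    [-1.0, -0.8, -1.2],
]

# Rotate the verification role through all members and average the result.
scores = [member_vs_ensemble_mean(ensemble, m) for m in range(3)]
potential_predictability = sum(scores) / len(scores)
```

Because the verification comes from the same model as the forecast, the result says how predictable the model world is, not how well the real world is predicted.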

(Prediction) Skill: This characterizes the current ability of the ensemble forecasting system to predict the real world. This is derived by computing a metric between the prediction ensemble and observations, reanalysis, or analysis of the real world [Meehl2013] [Pegion2019].

Skill Score: The most generic skill score can be defined as follows [Murphy1988]:

S = \frac{A_{f} - A_{r}}{A_{p} - A_{r}},

where A_{f}, A_{p}, and A_{r} represent the accuracy of the forecast being assessed, the accuracy of a perfect forecast, and the accuracy of the reference forecast (e.g. persistence), respectively [Murphy1985]. Here, S represents the improvement in accuracy of the forecasts over the reference forecasts relative to the total possible improvement in accuracy. Skill scores are typically designed to take a value of 1 for a perfect forecast and 0 for a forecast that is only as accurate as the reference forecast [Jolliffe2011].
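The generic skill score translates directly into code. The sketch below uses the mean square error as the accuracy measure, for which a perfect forecast has A_{p} = 0 and the score reduces to S = 1 - A_{f} / A_{r}. The function names and sample values are assumptions for illustration.

```python
def mean_square_error(forecasts, observations):
    """Average squared difference between paired forecasts and observations."""
    errors = [(f - o) ** 2 for f, o in zip(forecasts, observations)]
    return sum(errors) / len(errors)


def skill_score(a_forecast, a_reference, a_perfect=0.0):
    """Generic skill score S = (A_f - A_r) / (A_p - A_r).

    a_perfect defaults to 0, which is the accuracy of a perfect forecast
    for error measures such as MSE or MAE.
    """
    denominator = a_perfect - a_reference
    if denominator == 0:
        raise ValueError("reference forecast is already perfect; S is undefined")
    return (a_forecast - a_reference) / denominator


# Made-up values: the reference stands in for a persistence forecast.
observations = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.0, 2.5, 4.0]
reference = [0.0, 1.0, 2.0, 3.0]

a_f = mean_square_error(forecast, observations)   # 0.125
a_r = mean_square_error(reference, observations)  # 1.0
s = skill_score(a_f, a_r)                         # 0.875
```

A negative S would mean the forecast is less accurate than the reference, which is why the choice of reference forecast matters when comparing scores across studies.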


Hindcast: Retrospective forecasts of the past, initialized from a reconstruction and integrated forward in time; also called re-forecasts. Depending on the length of the integration, external forcings may or may not be included. The longer the integration (e.g. decadal vs. daily), the more important it is to include external forcing [Boer2016]. Because hindcasts cover periods that have already occurred, their prediction skill can be evaluated.

Prediction: Forecasts initialized from a reconstruction and integrated into the future. Depending on the length of the integration, external forcings may or may not be included. The longer the integration (e.g. decadal vs. daily), the more important it is to include external forcing [Boer2016]. Because predictions are made into the future, one must wait until the forecast period has passed before the skill of the forecast can be quantified.

Projection: An estimate of the future climate that depends on the externally forced climate response to factors such as anthropogenic greenhouse gases, aerosols, and volcanic eruptions [Meehl2013].



References

[Boer2016] Boer, G. J., Smith, D. M., Cassou, C., Doblas-Reyes, F., Danabasoglu, G., Kirtman, B., Kushnir, Y., Kimoto, M., Meehl, G. A., Msadek, R., Mueller, W. A., Taylor, K. E., Zwiers, F., Rixen, M., Ruprich-Robert, Y., and Eade, R.: The Decadal Climate Prediction Project (DCPP) contribution to CMIP6, Geosci. Model Dev., 9, 3751-3777, 2016.

[Griffies1997] Griffies, S. M., and K. Bryan. “A Predictability Study of Simulated North Atlantic Multidecadal Variability.” Climate Dynamics 13, no. 7–8 (1997): 459–87.

[Jolliffe2011] Ian T. Jolliffe and David B. Stephenson. Forecast Verification: A Practitioner’s Guide in Atmospheric Science. John Wiley & Sons, Ltd, Chichester, UK, 2011. ISBN 978-1-119-96000-3 978-0-470-66071-3.

[Meehl2013] Meehl, G. A., Goddard, L., Boer, G., Burgman, R., Branstator, G., Cassou, C., … & Karspeck, A. (2014). Decadal climate prediction: an update from the trenches. Bulletin of the American Meteorological Society, 95(2), 243-267.

[Murphy1985] Murphy, Allan H., and Daan, H. “Forecast evaluation.” Probability, Statistics, and Decision Making in the Atmospheric Sciences, A. H. Murphy and R. W. Katz, Eds., Westview Press, 1985, 379-437.

[Murphy1988] Murphy, Allan H. “Skill Scores Based on the Mean Square Error and Their Relationships to the Correlation Coefficient.” Monthly Weather Review 116, no. 12 (December 1, 1988): 2417–24.

[Pegion2019] Pegion, K., T. Delsole, E. Becker, and T. Cicerone (2019). “Assessing the Fidelity of Predictability Estimates”, Climate Dynamics, 53, 7251–7265.