Calculate ENSO Skill of NMME model NCEP-CFSv2 as Function of Initial Month vs. Lead Time¶
In this example, we demonstrate:¶
How to remotely access data from the North American Multi-model Ensemble (NMME) hindcast database and set it up to be used in
climpred
leveragingdask
.How to calculate the Anomaly Correlation Coefficient (ACC) using monthly data
How to calculate and plot historical forecast skill of the Nino3.4 index as function of initialization month and lead time.
The North American Multi-model Ensemble (NMME)¶
Further information on NMME is available from Kirtman et al. 2014 and the NMME project website
The NMME public database is hosted on the International Research Institute for Climate and Society (IRI) data server http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/
Since the NMME data server is accessed via this notebook, the time for the notebook to run may take a few minutes and vary depending on the speed that data is downloaded.
Definitions¶
- Anomalies
Departure from normal, where normal is defined as the climatological value based on the average value for each month over all years.
- Nino3.4
An index used to represent the evolution of the El Nino-Southern Oscillation (ENSO). Calculated as the average sea surface temperature (SST) anomalies in the region 5S-5N; 190-240
[1]:
import matplotlib.pyplot as plt
import xarray as xr
import pandas as pd
import numpy as np
from climpred import HindcastEnsemble
import climpred
Function to set 360 calendar to 360_day calendar and decond cf times
[2]:
def decode_cf(ds, time_var):
"""Decodes time dimension to CFTime standards."""
if ds[time_var].attrs['calendar'] == '360':
ds[time_var].attrs['calendar'] = '360_day'
ds = xr.decode_cf(ds, decode_times=True)
return ds
The original monthly sea surface temperature (SST) hindcast data for the NCEP-CFSv2 model from IRIDL is a large dataset (~20GB). However, we can leverage ingrid
on the IRIDL server and open server-side preprocessed data via OpenDAP
into xarray
. Averaging over longitude X
and latitude Y
and ensemble member M
reduces the download size to just a few kB.
http://iridl.ldeo.columbia.edu/dochelp/topics/DODS/fnlist.html
https://iridl.ldeo.columbia.edu/dochelp/Documentation/funcindex.html?Set-Language=en
We take the ensemble mean here, since we are just using deterministic metrics in this example. climpred
will automatically take the mean over the ensemble while evaluating metrics with comparison='e2o'
, but this should make things more efficient so it doesn’t have to be done multiple times.
[23]:
%%time
# server-side average over enso region and ensemble mean
url = 'http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/NCEP-CFSv2/.HINDCAST/.MONTHLY/.sst/X/190/240/RANGEEDGES/Y/-5/5/RANGEEDGES/[X Y M]average/dods'
fcstds = xr.open_dataset(url, decode_times=False)
fcstds = decode_cf(fcstds, 'S')#.compute()
fcstds
CPU times: user 10.5 ms, sys: 7.31 ms, total: 17.8 ms
Wall time: 971 ms
[23]:
<xarray.Dataset> Dimensions: (L: 10, S: 348) Coordinates: * L (L) float32 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 * S (S) object 1982-01-01 00:00:00 ... 2010-12-01 00:00:00 Data variables: sst (S, L) float64 ... Attributes: Conventions: IRIDL
- L: 10
- S: 348
- L(L)float320.5 1.5 2.5 3.5 ... 6.5 7.5 8.5 9.5
- long_name :
- Lead
- standard_name :
- forecast_period
- pointwidth :
- 1.0
- gridtype :
- 0
- units :
- months
array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5], dtype=float32)
- S(S)object1982-01-01 00:00:00 ... 2010-12-...
- long_name :
- Forecast Start Time
- standard_name :
- forecast_reference_time
- pointwidth :
- 0
- gridtype :
- 0
array([cftime.Datetime360Day(1982, 1, 1, 0, 0, 0, 0), cftime.Datetime360Day(1982, 2, 1, 0, 0, 0, 0), cftime.Datetime360Day(1982, 3, 1, 0, 0, 0, 0), ..., cftime.Datetime360Day(2010, 10, 1, 0, 0, 0, 0), cftime.Datetime360Day(2010, 11, 1, 0, 0, 0, 0), cftime.Datetime360Day(2010, 12, 1, 0, 0, 0, 0)], dtype=object)
- sst(S, L)float64...
- pointwidth :
- 0
- gribleveltype :
- 1
- gribcenter :
- 7
- process :
- Spectral Statistical Interpolation (SSI) analysis from "Final" run.
- grib_name :
- TMP
- GRIBgridcode :
- 3
- center :
- US Weather Service - National Met. Center
- gribparam :
- 11
- PTVersion :
- 2
- gribNumBits :
- 21
- PDS_TimeRange :
- 3
- gribfield :
- 1
- units :
- Celsius_scale
- long_name :
- Sea Surface Temperature
- standard_name :
- sea_surface_temperature
- history :
- Averaged over X[170.5W, 119.5W] Y[5.5S, 5.5N] M[0.5, 24.5]
array([[26.359797, 26.632403, 27.146464, ..., 26.792137, 26.502487, 26.508765], [26.684505, 27.176758, 27.901684, ..., 26.531667, 26.601257, 26.892236], [26.973014, 27.75665 , 28.272738, ..., 26.494105, 26.915383, 27.358508], ..., [25.602379, 25.256015, 25.404965, ..., 28.205103, 28.305424, 27.632426], [25.701525, 25.74596 , 25.821861, ..., 27.890485, 27.177615, 26.393917], [25.734744, 25.727682, 25.906104, ..., 27.177356, 26.30866 , 25.963932]])
- Conventions :
- IRIDL
The NMME data dimensions correspond to the following climpred
dimension definitions: L=lead
,M=member
, S=init
. We will rename the dimensions to their climpred
names.
[13]:
fcstds = fcstds.rename({'S': 'init',
'L': 'lead'})
Let’s make sure that the lead
dimension is set properly for climpred
. NMME data stores leads
as 0.5, 1.5, 2.5, etc, which correspond to 0, 1, 2, … months since initialization. We will change the lead
to be integers starting with zero. climpred
also requires that lead
dimension has an attribute called units
indicating what time units the lead
is assocated with. Options are: years,seasons,months,weeks,pentads,days
. For the monthly NMME data, the lead
units
are months
.
[14]:
fcstds['lead'] = (fcstds['lead'] - 0.5).astype('int')
fcstds['lead'].attrs = {'units': 'months'}
Next, we want to get the verification SST data from the data server. It’s a lot smaller, so we don’t need to worry about saving it out locally. We keep the spatial dimension as one full chunk since we’ll be averaging over the Nino3.4 box soon.
[8]:
%%time
obsurl='http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/.OIv2_SST/.sst/X/190/240/RANGEEDGES/Y/-5/5/RANGEEDGES/[X Y]average/dods'
verifds = xr.open_dataset(obsurl, decode_times=False)
verifds = decode_cf(verifds, 'T').compute()
verifds
[8]:
<xarray.Dataset> Dimensions: (T: 405) Coordinates: * T (T) object 1982-01-16 00:00:00 ... 2015-09-16 00:00:00 Data variables: sst (T) float64 26.81 26.78 27.26 28.06 ... 28.96 28.83 28.89 28.99 Attributes: Conventions: IRIDL
- T: 405
- T(T)object1982-01-16 00:00:00 ... 2015-09-...
- long_name :
- time
- standard_name :
- time
- pointwidth :
- 1.0
- gridtype :
- 0
array([cftime.Datetime360Day(1982, 1, 16, 0, 0, 0, 0), cftime.Datetime360Day(1982, 2, 16, 0, 0, 0, 0), cftime.Datetime360Day(1982, 3, 16, 0, 0, 0, 0), ..., cftime.Datetime360Day(2015, 7, 16, 0, 0, 0, 0), cftime.Datetime360Day(2015, 8, 16, 0, 0, 0, 0), cftime.Datetime360Day(2015, 9, 16, 0, 0, 0, 0)], dtype=object)
- sst(T)float6426.81 26.78 27.26 ... 28.89 28.99
- units :
- Celsius_scale
- standard_name :
- sea_surface_temperature
- long_name :
- OIv2 SST
- file_missing_value :
- -999.0
- history :
- Averaged over X[170.5W, 119.5W] Y[5.5S, 5.5N] minimum 0.0% data present
array([26.80636245, 26.77619304, 27.25946275, 28.05571419, 28.56691794, 28.77153006, 28.12872129, 27.95445919, 28.12440288, 28.63869933, 28.7903315 , 29.17271081, 29.31659301, 29.10298495, 29.01172086, 28.87702543, 28.88327384, 28.27826706, 27.18444041, 26.65282515, 26.54960547, 26.00825434, 25.7337301 , 25.73467051, 25.76802008, 26.47709099, 26.94330037, 27.43582137, 27.44682714, 26.94729717, 26.8152074 , 26.42340828, 26.50594514, 26.03431079, 25.55105625, 25.13158785, 25.54339705, 25.77078159, 26.31784647, 26.88128476, 27.18554378, 26.93763364, 26.7543246 , 26.57103309, 26.33720493, 26.27094423, 26.27417213, 26.19320073, 25.88859946, 26.05443525, 26.73559627, 27.49702834, 27.55883544, 27.7301475 , 27.41041604, 27.19786443, 27.3768761 , 27.61347201, 27.77248679, 27.74600588, 27.92357411, 28.03739957, 28.47619392, 28.80135489, 28.74925907, 29.00166219, 28.80106639, 28.58730101, 28.42354793, 28.12456723, 28.05122491, 27.68012059, 27.39921763, 27.28984215, 27.39055715, 27.4134766 , 26.64241489, 26.25492489, 25.70506526, 25.37085377, 25.53940319, 24.77067705, 24.42852863, 24.47320395, 24.65057381, 25.42892872, 26.00861484, 26.7682635 , 27.16020158, 27.043209 , 26.80572323, 26.3996187 , 26.33336942, 26.34862558, 26.32353249, 26.45710118, 26.6055484 , 27.00797548, 27.50882711, 28.0449938 , ... 27.59111541, 26.87787151, 26.29704444, 25.88461301, 25.35703556, 25.18927489, 25.10221523, 24.83297229, 24.96761155, 26.15062669, 26.87729193, 27.21998116, 27.21548372, 27.22375155, 26.90979516, 26.52173762, 26.42358009, 26.37789208, 25.85033375, 25.65318939, 26.11665718, 26.73950744, 27.55869893, 28.06118484, 28.14569688, 27.97532858, 27.58244179, 27.52204754, 27.69931666, 28.22365902, 28.32286693, 28.10594772, 27.97355228, 28.30179522, 28.38642687, 27.74698526, 27.08473314, 26.21420421, 25.62437796, 25.20052714, 25.11967145, 25.17304929, 25.04846216, 25.03166057, 25.54073723, 26.29518586, 27.05299904, 27.44639231, 27.48435699, 27.01205012, 26.2677132 , 26.05956093, 25.82556029, 25.70719061, 25.61535139, 25.57895048, 26.11174558, 26.70042492, 27.43224185, 27.82736755, 27.96059549, 27.77168496, 27.58106955, 27.28944215, 27.06208994, 27.08904462, 26.55783813, 26.26852622, 26.42487325, 27.0528311 , 27.72604921, 27.6386664 , 27.48780996, 26.97813311, 26.62789983, 26.73223107, 26.45502341, 26.72931205, 26.6156284 , 26.1668536 , 26.28107497, 27.05677844, 28.03573357, 28.32613421, 28.13547644, 27.46770397, 27.09496695, 27.22755388, 27.24070865, 27.55342097, 27.41339142, 27.17732809, 27.35904677, 27.83652654, 28.57704833, 28.88726554, 28.95777011, 28.82628466, 28.89171809, 28.99064635])
- Conventions :
- IRIDL
Rename the dimensions to correspond to climpred
dimensions.
[10]:
verifds = verifds.rename({'T': 'time'})
Convert the time
data to be on the first of the month and in the same calendar format as the forecast output. The time dimension is natively decoded to start in February, even though it starts in January. It is also labelled as mid-month, and we need to label it as the start of the month to ensure that the dates align properly.
[11]:
verifds['time'] = xr.cftime_range(start='1982-01', periods=verifds['time'].size, freq='MS', calendar='360_day')
Subset the data to 1982-2010
[15]:
fcstds = fcstds.sel(init=slice('1982-01-01', '2010-12-01'))
verifds = verifds.sel(time=slice('1982-01-01', '2010-12-01'))
Calculate the Nino3.4 index for forecast and verification.
[17]:
fcstclimo = fcstds.groupby('init.month').mean('init')
fcst = (fcstds.groupby('init.month') - fcstclimo)
verifclimo = verifds.groupby('time.month').mean('time')
verif = (verifds.groupby('time.month') - verifclimo)
Use the climpred.HindcastEnsemble
to calculate the anomaly correlation coefficient (ACC) as a function of initialization month and lead. We use xarray
’s .groupby()
to create sub-groups based on the month they were initialized in.
[19]:
%%time
result = []
for label, group in fcst.groupby('init.month'):
hindcast = HindcastEnsemble(group)
hindcast = hindcast.add_observations(verif)
skill = hindcast.verify(metric='acc', comparison='e2o', dim='init', alignment='maximize')
result.append(skill)
result = xr.concat(result, dim='month')
result['month'] = np.arange(12) + 1
CPU times: user 1.23 s, sys: 10.9 ms, total: 1.24 s
Wall time: 1.26 s
Plot the ACC as function of Initial Month and lead-time
[27]:
result.sst.plot(y='lead', cmap='YlOrRd',
vmin=0.0,
vmax=1.0)
plt.title('NCEP-CFSv2 Nino3.4 ACC')
plt.xlabel('Initial Month')
plt.ylabel('Lead Time (Months)')
[27]:
Text(0, 0.5, 'Lead Time (Months)')
References¶
Kirtman, B.P., D. Min, J.M. Infanti, J.L. Kinter, D.A. Paolino, Q. Zhang, H. van den Dool, S. Saha, M.P. Mendez, E. Becker, P. Peng, P. Tripp, J. Huang, D.G. DeWitt, M.K. Tippett, A.G. Barnston, S. Li, A. Rosati, S.D. Schubert, M. Rienecker, M. Suarez, Z.E. Li, J. Marshak, Y. Lim, J. Tribbia, K. Pegion, W.J. Merryfield, B. Denis, and E.F. Wood, 2014: The North American Multimodel Ensemble: Phase-1 Seasonal-to-Interannual Prediction; Phase-2 toward Developing Intraseasonal Prediction. Bull. Amer. Meteor. Soc., 95, 585–601, https://doi.org/10.1175/BAMS-D-12-00050.1