Temporal and Spatial Smoothing

You can run this notebook in a live session binder badge or view it on Github.

Temporal and Spatial Smoothing#

This demo demonstrates climpred’s capabilities to postprocess decadal prediction output before skill verification. Here, we showcase a set of methods to smooth out noise in the spatial and temporal domain.

# linting
%load_ext nb_black
%load_ext lab_black
%matplotlib inline
from climpred import PerfectModelEnsemble, HindcastEnsemble
from climpred.tutorial import load_dataset
import matplotlib.pylab as plt
import xarray as xr
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/xclim/sdba.py:12: UserWarning: The `xclim.sdba` module has been split into its own package `xsdba`. Users are encouraged to use `xsdba` directly. For the time being, `xclim.sdba` will import `xsdba` to allow for API compatibility. This behaviour may change in the future. For more information, see: https://xsdba.readthedocs.io/en/stable/xclim_migration_guide.html
  warnings.warn(
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
# Sea surface temperature
v = "tos"
initialized3d = load_dataset("MPI-PM-DP-3D")[v]
control3d = load_dataset("MPI-control-3D")[v]

climpred requires that lead dimension has an attribute called units indicating what time units the lead is assocated with. Options are: years, seasons, months, weeks, pentads, days, hours, minutes, seconds. For the this data, the lead units are years.

initialized3d["lead"].attrs = {"units": "years"}

Temporal smoothing#

In order to reduce temporal noise, decadal predictions are recommended to take multi-year averages [Goddard et al., 2013].

pm = PerfectModelEnsemble(initialized3d).add_control(control3d)
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/climpred/utils.py:198: UserWarning: Assuming annual resolution starting Jan 1st due to numeric inits. Please change ``init`` to a datetime if it is another resolution. We recommend using xr.CFTimeIndex as ``init``, see https://climpred.readthedocs.io/en/stable/setting-up-data.html.
  warnings.warn(
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/climpred/utils.py:198: UserWarning: Assuming annual resolution starting Jan 1st due to numeric inits. Please change ``init`` to a datetime if it is another resolution. We recommend using xr.CFTimeIndex as ``init``, see https://climpred.readthedocs.io/en/stable/setting-up-data.html.
  warnings.warn(

PerfectModelEnsemble.smooth() ({"time":x}) aggregates over x timesteps in time dimensions lead and time. Here it does not matter whether you specify lead or time, temporal smoothing is applied to both time dimensions. Note that the time dimension labels are not changed by this temporal smoothing immediately.

pm_tsmoothed = pm.smooth({"lead": 3})
print("initialized", pm_tsmoothed.get_initialized().coords, "\n")
print("control", pm_tsmoothed.get_control().coords)
initialized Coordinates:
    lon         (y, x) float64 451kB ...
    lat         (y, x) float64 451kB ...
  * init        (init) object 32B 3014-01-01 00:00:00 ... 3237-01-01 00:00:00
  * member      (member) int64 32B 1 2 3 4
  * lead        (lead) int64 24B 1 2 3
    valid_time  (lead, init) object 96B 3015-01-01 00:00:00 ... 3240-01-01 00... 

control Coordinates:
    lon      (y, x) float64 451kB ...
    lat      (y, x) float64 451kB ...
  * time     (time) object 384B 3000-01-01 00:00:00 ... 3047-01-01 00:00:00

But after calling PerfectModelEnsemble.verify(), the correct time aggregation label is automatically set.

pm_tsmoothed.verify(metric="rmse", comparison="m2e", dim=["init", "member"])[v].plot(
    col="lead", vmin=0, vmax=0.7, yincrease=False,
)
<xarray.plot.facetgrid.FacetGrid at 0x748789d4f700>
_images/ca3a1b5110f2fb9255409d9875be913f2496afa3721824e275bcf0d5e655cb91.png

Compare to the prediction skill without smoothing (with rmse smaller is better):

pm.verify(metric="rmse", comparison="m2e", dim=["init", "member"])[v].plot(
    col="lead", vmin=0, vmax=0.7, yincrease=False,
)
<xarray.plot.facetgrid.FacetGrid at 0x7487844db520>
_images/7563369e81e561863e775d6c9049755a539142d0fdaa639fb5a17699c8d5b638.png

You can plot time-aggregated skill timeseries by specifying x=lead_center while plotting. Lead centers are placed in the center of the lead-aggregation boundaries. Note how the correlation of forecasts is increasing for longer time-aggregations because noise is smoothed out.

pm_1d = PerfectModelEnsemble(load_dataset("MPI-PM-DP-1D"))
pm_1d = pm_1d.add_control(load_dataset("MPI-control-1D"))
pm_NA = pm_1d.sel(area="North_Atlantic_SPG", period="DJF")[['tos']]
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/climpred/utils.py:198: UserWarning: Assuming annual resolution starting Jan 1st due to numeric inits. Please change ``init`` to a datetime if it is another resolution. We recommend using xr.CFTimeIndex as ``init``, see https://climpred.readthedocs.io/en/stable/setting-up-data.html.
  warnings.warn(
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/climpred/utils.py:198: UserWarning: Assuming annual resolution starting Jan 1st due to numeric inits. Please change ``init`` to a datetime if it is another resolution. We recommend using xr.CFTimeIndex as ``init``, see https://climpred.readthedocs.io/en/stable/setting-up-data.html.
  warnings.warn(
for time_agg in [1, 2, 3, 4]:
    pm_NA.smooth({"lead": time_agg}).verify(
        metric="acc", comparison="m2e", dim=["init", "member"]
    ).tos.plot(x="lead_center", marker="o", label=f"{time_agg}-years smoothed")
plt.legend()
<matplotlib.legend.Legend at 0x7487848bae30>
_images/0015205359fbd11d1a3432b43925ebb7a271636859f36f703405c1a95219d1c9.png

By default, smooth preserves the original lead labels (using drop=False). Using drop=True, the lead dimension is reduced to the aggregated lead centers and the original lead coordinate is dropped.

pm_NA.smooth({"lead": 4}, how="mean", drop=True).verify(
    metric="acc", comparison="m2e", dim=["init", "member"]
).tos.coords
Coordinates:
    period       <U3 12B 'DJF'
    area         <U18 72B 'North_Atlantic_SPG'
  * lead         (lead) <U5 100B '1-4' '5-8' '9-12' '13-16' '17-20'
    lead_center  (lead) float64 40B 2.5 6.5 10.5 14.5 18.5

Now, we showcase these smoothing features for HindcastEnsemble.smooth().

v = "SST"
initialized = load_dataset("CESM-DP-SST-3D")[v]
reconstruction = load_dataset("FOSI-SST-3D")[v]
# Move reconstruction into proper anomaly space
reconstruction = reconstruction - reconstruction.sel(time=slice(1964, 2014)).mean(
    "time"
)
hindcast = HindcastEnsemble(initialized)
hindcast = hindcast.add_observations(reconstruction)
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/climpred/utils.py:198: UserWarning: Assuming annual resolution starting Jan 1st due to numeric inits. Please change ``init`` to a datetime if it is another resolution. We recommend using xr.CFTimeIndex as ``init``, see https://climpred.readthedocs.io/en/stable/setting-up-data.html.
  warnings.warn(
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/climpred/utils.py:198: UserWarning: Assuming annual resolution starting Jan 1st due to numeric inits. Please change ``init`` to a datetime if it is another resolution. We recommend using xr.CFTimeIndex as ``init``, see https://climpred.readthedocs.io/en/stable/setting-up-data.html.
  warnings.warn(
hindcast.smooth({"lead": 5}).verify(
    metric="rmse", comparison="e2r", dim="init", alignment="same_verif"
)[v].plot(col="lead", robust=True)
<xarray.plot.facetgrid.FacetGrid at 0x748789895c00>
_images/65db2d5f94da0edae8cd65f7786ad14b653a84e155a93972383d2a128bf704ca.png

Spatial smoothing#

In order to reduce spatial noise, global decadal predictions are recommended to get regridded to a 5 degree longitude x 5 degree latitude grid as recommended [Goddard et al., 2013].

v = "tos"
pm_ssmoothed = pm.smooth({"lon": 5, "lat": 5})
pm_ssmoothed.get_initialized().coords
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/xesmf/backend.py:56: UserWarning: Latitude is outside of [-90, 90]
  warnings.warn('Latitude is outside of [-90, 90]')
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/xesmf/backend.py:56: UserWarning: Latitude is outside of [-90, 90]
  warnings.warn('Latitude is outside of [-90, 90]')
Coordinates:
  * lead        (lead) int64 40B 1 2 3 4 5
  * init        (init) object 32B 3014-01-01 00:00:00 ... 3237-01-01 00:00:00
  * member      (member) int64 32B 1 2 3 4
    valid_time  (lead, init) object 160B 3015-01-01 00:00:00 ... 3242-01-01 0...
  * lat         (lat) float64 288B -83.97 -78.97 -73.97 ... 81.03 86.03 91.03
  * lon         (lon) float64 584B -180.0 -175.0 -170.0 ... 170.0 175.0 180.0
pm_ssmoothed.verify(metric="rmse", comparison="m2e", dim=["init", "member"])[v].plot(
    col="lead", robust=True
)
<xarray.plot.facetgrid.FacetGrid at 0x748784c43e50>
_images/b1cf684fbecaf03d84a1ed471730f7902b58abadf81018e2a6c4097d11dbac2f.png

Spatial smoothing guesses the names corresponding to lon and lat in the coordinates of the PredictionEnsemble underlying datasets.

hindcast.get_initialized().coords
Coordinates:
    TLAT        (nlat, nlon) float64 8kB ...
    TLONG       (nlat, nlon) float64 8kB ...
  * init        (init) object 512B 1954-01-01 00:00:00 ... 2017-01-01 00:00:00
  * lead        (lead) int32 40B 1 2 3 4 5 6 7 8 9 10
    TAREA       (nlat, nlon) float64 8kB ...
    valid_time  (lead, init) object 5kB 1955-01-01 00:00:00 ... 2027-01-01 00...
hindcast.smooth({"lon": 1, "lat": 1}).get_initialized().coords
Coordinates:
  * init        (init) object 512B 1954-01-01 00:00:00 ... 2017-01-01 00:00:00
  * lead        (lead) int32 40B 1 2 3 4 5 6 7 8 9 10
    valid_time  (lead, init) object 5kB 1955-01-01 00:00:00 ... 2027-01-01 00...
  * lat         (lat) float64 88B -9.75 -8.75 -7.75 ... -1.75 -0.7503 0.2497
  * lon         (lon) float64 240B 250.8 251.8 252.8 253.8 ... 277.8 278.8 279.8

HindcastEnsemble.smooth() ("goddard2013") creates 4-year means and 5x5 degree regridding as suggested in [Goddard et al., 2013].

pm.smooth("goddard2013").verify(
    metric="acc", comparison="m2e", dim=["init", "member"]
).coords
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/xesmf/backend.py:56: UserWarning: Latitude is outside of [-90, 90]
  warnings.warn('Latitude is outside of [-90, 90]')
/home/docs/checkouts/readthedocs.org/user_builds/climpred/conda/latest/lib/python3.10/site-packages/xesmf/backend.py:56: UserWarning: Latitude is outside of [-90, 90]
  warnings.warn('Latitude is outside of [-90, 90]')
Coordinates:
  * lead         (lead) <U3 24B '1-4' '2-5'
  * lat          (lat) float64 288B -83.97 -78.97 -73.97 ... 81.03 86.03 91.03
  * lon          (lon) float64 584B -180.0 -175.0 -170.0 ... 170.0 175.0 180.0
    lead_center  (lead) float64 16B 2.5 3.5

References#

[1] (1,2,3)

L. Goddard, A. Kumar, A. Solomon, D. Smith, G. Boer, P. Gonzalez, V. Kharin, W. Merryfield, C. Deser, S. J. Mason, B. P. Kirtman, R. Msadek, R. Sutton, E. Hawkins, T. Fricker, G. Hegerl, C. a. T. Ferro, D. B. Stephenson, G. A. Meehl, T. Stockdale, R. Burgman, A. M. Greene, Y. Kushnir, M. Newman, J. Carton, I. Fukumori, and T. Delworth. A verification framework for interannual-to-decadal predictions experiments. Climate Dynamics, 40(1-2):245–272, January 2013. doi:10/f4jjvf.