Bias Removal

Climate models can have biases relative to different verification datasets. Commonly, these biases are removed by post-processing before forecast skill is verified. climpred provides convenience functions to do so.


Select from the following methods by specifying how in HindcastEnsemble.remove_bias (a schematic sketch of the mean and standard-deviation corrections follows the list):

  • how="additive_mean": correcting the mean forecast additively

  • how="multiplicative_mean": correcting the mean forecast multiplicatively

  • how="multiplicative_std": correcting the standard deviation multiplicatively

Wrapped from the GitHub package bias_correction:

  • how="modified_quantile"

  • how="basic_quantile"

  • how="gamma_mapping"

  • how="normal_mapping"

Wrapped from xclim.sdba:

  • how="DetrendedQuantileMapping"

  • how="EmpiricalQuantileMapping"

  • how="LOCI"

  • how="PrincipalComponents"

  • how="QuantileDeltaMapping"

  • how="Scaling"

[1]:
# linting
%load_ext nb_black
%load_ext lab_black
[2]:
import climpred
import xarray as xr
import matplotlib.pyplot as plt
from climpred import HindcastEnsemble
[3]:
init = climpred.tutorial.load_dataset("NMME_hindcast_Nino34_sst")
obs = climpred.tutorial.load_dataset("NMME_OIv2_Nino34_sst")
v = "sst"
[4]:
hindcast = HindcastEnsemble(init.sel(model="GFDL-CM2p5-FLOR-A06")).add_observations(obs)
hindcast.plot()
/Users/aaron.spring/Coding/climpred/climpred/checks.py:256: UserWarning: Did not find dimension "init", but renamed dimension S with CF-complying standard_name "forecast_reference_time" to init.
  warnings.warn(
/Users/aaron.spring/Coding/climpred/climpred/checks.py:256: UserWarning: Did not find dimension "member", but renamed dimension M with CF-complying standard_name "realization" to member.
  warnings.warn(
/Users/aaron.spring/Coding/climpred/climpred/checks.py:256: UserWarning: Did not find dimension "lead", but renamed dimension L with CF-complying standard_name "forecast_period" to lead.
  warnings.warn(
[4]:
<AxesSubplot:xlabel='time', ylabel='Sea Surface Temperature\n[Celsius_scale]'>
_images/bias_removal_5_2.png

The warming trend in the observations is similar to that in the initialized hindcast.

Additive mean bias removal

Typically, the bias depends on lead time and should therefore also be removed as a function of lead.

[5]:
bias = hindcast.verify(
    metric="additive_bias", comparison="e2o", dim=[], alignment="same_verifs"
)

bias[v].plot()
[5]:
<matplotlib.collections.QuadMesh at 0x14e529ac0>
_images/bias_removal_9_1.png
[6]:
# group bias by seasonality
seasonality = climpred.options.OPTIONS["seasonality"]
seasonality
[6]:
'month'
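The seasonality option determines how the bias is grouped along init (and later removed). A minimal sketch, assuming climpred.set_options accepts a seasonality keyword and can be used as a context manager like xarray.set_options:

# Assumption: climpred.set_options(seasonality=...) temporarily switches the
# grouping, e.g. from the default "month" to "season".
with climpred.set_options(seasonality="season"):
    print(climpred.options.OPTIONS["seasonality"])  # expected: "season"
    bias.groupby("init.season").mean()[v].plot()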
[7]:
bias.groupby(f"init.{seasonality}").mean()[v].plot()
[7]:
<matplotlib.collections.QuadMesh at 0x14f097e80>
_images/bias_removal_11_1.png

An initial warm bias develops into a cold bias, especially in winter.

train_test_split

Risbey et al. (2021) demonstrate how important a clean separation of the train and test periods is for bias reduction.

The following train_test_split options are implemented in climpred; a preview of the corresponding remove_bias calls follows the list:

  • unfair: completely overlapping train and test (climpred default)

  • unfair-cv: overlapping train and test except for the current init, which is left out (set cv='LOO')

  • fair: no overlap between train and test (recommended)
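Each option expects different keywords; this preview mirrors the calls made in the cells below:

# "unfair": no extra arguments needed
hindcast.remove_bias(
    how="additive_mean", alignment="same_verifs", train_test_split="unfair"
)

# "unfair-cv": leave-one-out cross-validation over init
hindcast.remove_bias(
    how="additive_mean",
    alignment="same_verifs",
    train_test_split="unfair-cv",
    cv="LOO",
)

# "fair": requires a training period via train_time (or train_init)
hindcast.remove_bias(
    how="additive_mean",
    alignment="same_verifs",
    train_test_split="fair",
    train_time=slice("1982", "1998"),
)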

[8]:
metric_kwargs = dict(
    metric="rmse", alignment="same_verifs", dim="init", comparison="e2o", skipna=True
)
[9]:
# "fair" computes the bias over train_time/train_init and drops that period from hindcast
hindcast.remove_bias(
    how="additive_mean",
    alignment=metric_kwargs["alignment"],
    train_test_split="fair",
    train_time=slice("1982", "1998"),
).plot()
[9]:
<AxesSubplot:xlabel='time', ylabel='Sea Surface Temperature\n[Celsius_scale]'>
_images/bias_removal_15_1.png
[10]:
hindcast.verify(**metric_kwargs)[v].plot(label="no bias correction")
train_test_split = ["unfair", "unfair-cv", "fair"]  # options compared below

hindcast.remove_bias(
    how="additive_mean", alignment=metric_kwargs["alignment"], train_test_split="unfair"
).verify(**metric_kwargs)[v].plot(label="unfair")

hindcast.remove_bias(
    how="additive_mean",
    alignment=metric_kwargs["alignment"],
    train_test_split="unfair-cv",
    cv="LOO",
).verify(**metric_kwargs)[v].plot(label="unfair-cv")

hindcast.remove_bias(
    how="additive_mean",
    alignment=metric_kwargs["alignment"],
    train_test_split="fair",
    train_time=slice("1982", "1998"),
).verify(**metric_kwargs)[v].plot(label="fair")


plt.legend()
plt.title(
    f"NMME {hindcast.coords['model'].values} Nino3.4 SST {metric_kwargs['metric'].upper()}"
)
[10]:
Text(0.5, 1.0, 'NMME GFDL-CM2p5-FLOR-A06 Nino3.4 SST RMSE')
_images/bias_removal_16_1.png

Comparison of how methods

[11]:
methods = [
    "additive_mean",
    # "multiplicative_std",
    "DetrendedQuantileMapping",
    "EmpiricalQuantileMapping",
    # "PrincipalComponents",
    # "LOCI",
    "QuantileDeltaMapping",
    "Scaling",
    "modified_quantile",
    "basic_quantile",
    # "gamma_mapping",
    # "normal_mapping",
]
[12]:
import warnings

warnings.simplefilter("ignore")
[13]:
# xclim.sdba requires pint units
hindcast._datasets["initialized"][v].attrs["units"] = "C"
hindcast._datasets["observations"][v].attrs["units"] = "C"
[14]:
hindcast.sel(init=slice("1983", "2018")).verify(**metric_kwargs)[v].plot(label="raw")
for method in methods:
    hindcast.remove_bias(
        how=method, alignment=metric_kwargs["alignment"], train_test_split="unfair"
    ).verify(**metric_kwargs)[v].plot(label=method)
plt.legend()
[14]:
<matplotlib.legend.Legend at 0x14ed73850>
_images/bias_removal_21_1.png