climpred.classes.HindcastEnsemble.remove_bias

HindcastEnsemble.remove_bias(alignment: Optional[str] = None, how: str = 'additive_mean', train_test_split: str = 'unfair', train_init: Optional[Union[xarray.DataArray, slice]] = None, train_time: Optional[Union[xarray.DataArray, slice]] = None, cv: Union[bool, str] = False, **metric_kwargs: Optional[Any]) → climpred.classes.HindcastEnsemble

Remove bias from HindcastEnsemble.

The bias is grouped by the seasonality set via set_options. When wrapping xclim.sdba.adjustment.TrainAdjust, use group instead.
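
As a rough illustration of what grouping the bias by seasonality means, the following pure-Python sketch computes an additive mean bias separately for each calendar month and subtracts it from the forecasts. It is not climpred's implementation: the helper name remove_monthly_bias and the toy data are hypothetical, and monthly grouping is assumed here as just one possible seasonality setting.

```python
from collections import defaultdict
from statistics import mean


def remove_monthly_bias(months, forecast, observed):
    """Subtract the mean forecast-minus-observation error of each month.

    Hypothetical helper for illustration only, not part of climpred.
    """
    errors = defaultdict(list)
    for month, fcst, obs in zip(months, forecast, observed):
        errors[month].append(fcst - obs)
    # One additive bias value per calendar month.
    bias = {month: mean(values) for month, values in errors.items()}
    return [fcst - bias[month] for month, fcst in zip(months, forecast)]


# Toy data: the forecast is 1.0 too warm in January and 0.5 too cold in July.
months = [1, 7, 1, 7]
observed = [10.0, 20.0, 11.0, 21.0]
forecast = [11.0, 19.5, 12.0, 20.5]

corrected = remove_monthly_bias(months, forecast, observed)
print(corrected)  # [10.0, 20.0, 11.0, 21.0]
```

A single bias computed over all months would be 0.25 here and would correct neither season properly, which is why the grouping matters.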

Parameters
  • alignment – which inits or verification times should be aligned?

    • "maximize": maximize the degrees of freedom by slicing initialized and verif to a common time frame at each lead.

    • "same_inits": slice to a common init frame prior to computing metric. This philosophy follows the thought that each lead should be based on the same set of initializations.

    • "same_verif": slice to a common/consistent verification time frame prior to computing metric. This philosophy follows the thought that each lead should be based on the same set of verification dates.

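  • how – what kind of bias removal to perform. Defaults to "additive_mean". Methods wrapped from xclim (for example "DetrendedQuantileMapping") and from bias_correction (for example "modified_quantile") are also accepted, as the examples on this page show.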

  • train_test_split – how to separate the train period, used to calculate the bias, from the test period, to which the bias correction is applied. For a detailed description, see Risbey et al. 2021:

    • "fair": no overlap between train and test (recommended). Set either train_init or train_time.

    • "unfair": completely overlapping train and test (default).

    • "unfair-cv": overlapping train and test except for the current init, which is left out (set cv="LOO").

  • train_init – Define initializations for training when alignment="same_inits" or alignment="maximize".

  • train_time – Define time for training when alignment="same_verif".

  • cv – Only relevant when train_test_split="unfair-cv". Defaults to False.

    • True/"LOO": calculate the bias by leaving the given initialization out.

    • False: include all initializations in the calculation of the bias, which is much faster and yields similar skill when the number of initializations is large.

  • **metric_kwargs – passed to xclim.sdba (including group) or bias_correction.XBiasCorrection
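
The three train_test_split modes can be pictured with a small pure-Python sketch of additive mean bias removal at a single lead. This is not climpred's code: the function additive_bias_removal, its train_idx argument and the toy data are hypothetical. It also assumes that in the "fair" case only the test initializations are returned, which is consistent with the lower number_of_initializations in the "fair" example on this page.

```python
from statistics import mean


def additive_bias_removal(forecast, observed, train_test_split="unfair", train_idx=()):
    """Return bias-corrected forecasts for one lead (illustrative only).

    forecast, observed: paired values, one entry per initialization.
    train_idx: indices of the training initializations, used only for "fair".
    """
    errors = [f - o for f, o in zip(forecast, observed)]
    if train_test_split == "unfair":
        # Train and test overlap completely: one bias from all initializations.
        bias = mean(errors)
        return [f - bias for f in forecast]
    if train_test_split == "unfair-cv":
        # Leave-one-out: the bias applied to init i excludes init i itself.
        total, n = sum(errors), len(errors)
        return [f - (total - errors[i]) / (n - 1) for i, f in enumerate(forecast)]
    if train_test_split == "fair":
        # Bias from the training inits only, applied to the remaining inits.
        train = set(train_idx)
        bias = mean([errors[i] for i in sorted(train)])
        return [f - bias for i, f in enumerate(forecast) if i not in train]
    raise ValueError(f"unknown train_test_split: {train_test_split!r}")


observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.5, 4.0, 5.0]  # errors: 0.5, 0.5, 1.0, 1.0

unfair = additive_bias_removal(forecast, observed, "unfair")
loo = additive_bias_removal(forecast, observed, "unfair-cv")
fair = additive_bias_removal(forecast, observed, "fair", train_idx=[0, 1])
print(unfair)  # [0.75, 1.75, 3.25, 4.25]
print(fair)    # [3.5, 4.5]
```

In the "fair" call the bias of 0.5 learned from the first two initializations under-corrects the last two, whose error is 1.0. This mirrors what a real-time forecast faces, since the bias of future initializations is not known in advance.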

Returns

Bias-removed HindcastEnsemble.

Example

Skill from raw model output without bias reduction:

>>> HindcastEnsemble.verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.08359 0.08141 0.08362 ... 0.1361 0.1552 0.1664
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     64
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []

Note that this HindcastEnsemble is already bias-reduced, so train_test_split="unfair" has hardly any effect. Use all initializations to calculate the bias and verify skill:

>>> HindcastEnsemble.remove_bias(
...     alignment="maximize", how="additive_mean", train_test_split="unfair"
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.08349 0.08039 0.07522 ... 0.07305 0.08107 0.08255
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     64
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []

Use only the initializations from 1954 to 1980 to calculate the bias. Note that this HindcastEnsemble is already bias-reduced, so train_test_split="fair" worsens skill here. In general, train_test_split="fair" is recommended for a fair comparison against real-time forecasts.

>>> HindcastEnsemble.remove_bias(
...     alignment="maximize",
...     how="additive_mean",
...     train_test_split="fair",
...     train_init=slice("1954", "1980"),
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.132 0.1085 0.08722 ... 0.08209 0.08969 0.08732
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     37
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []

Wrapping a method from xclim via how and providing group for the groupby:

>>> HindcastEnsemble.remove_bias(
...     alignment="same_inits",
...     group="init",
...     how="DetrendedQuantileMapping",
...     train_test_split="unfair",
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.09841 0.09758 0.08238 ... 0.0771 0.08119 0.08322
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     52
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []

Wrapping a method from bias_correction via how:

>>> HindcastEnsemble.remove_bias(
...     alignment="same_inits",
...     how="modified_quantile",
...     train_test_split="unfair",
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.07628 0.08293 0.08169 ... 0.1577 0.1821 0.2087
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     52
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []
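
The alignment options described under Parameters can be sketched in pure Python as follows. This is a simplified picture under stated assumptions, not climpred's implementation: the function align is hypothetical, initializations and verification times are plain integer years, and leads are whole years.

```python
def align(inits, leads, verif_times, alignment):
    """Map each lead to the (init, verification time) pairs used at that lead.

    Hypothetical helper for illustration only, not part of climpred.
    """
    verif = set(verif_times)
    init_set = set(inits)
    if alignment == "maximize":
        # Per lead, keep every init whose target time is observed.
        keep = {lead: [i for i in inits if i + lead in verif] for lead in leads}
    elif alignment == "same_inits":
        # Keep only inits that can be verified at every lead.
        common = [i for i in inits if all(i + lead in verif for lead in leads)]
        keep = {lead: common for lead in leads}
    elif alignment == "same_verif":
        # Keep only verification times that an init reaches at every lead.
        common = [t for t in sorted(verif)
                  if all(t - lead in init_set for lead in leads)]
        keep = {lead: [t - lead for t in common] for lead in leads}
    else:
        raise ValueError(f"unknown alignment: {alignment!r}")
    return {lead: [(i, i + lead) for i in kept] for lead, kept in keep.items()}


inits = [2000, 2001, 2002, 2003, 2004]
leads = [1, 2]
verif_times = [2001, 2002, 2003, 2004]

maximize = align(inits, leads, verif_times, "maximize")
same_inits = align(inits, leads, verif_times, "same_inits")
same_verif = align(inits, leads, verif_times, "same_verif")

for name, result in [("maximize", maximize), ("same_inits", same_inits),
                     ("same_verif", same_verif)]:
    print(name, result)
```

With this toy data, "maximize" uses four pairs at lead 1 and three at lead 2, "same_inits" uses initializations 2000 to 2002 at both leads, and "same_verif" verifies against 2002 to 2004 at both leads.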