climpred.classes.HindcastEnsemble.remove_bias

HindcastEnsemble.remove_bias(alignment: str, how: str = 'additive_mean', train_test_split: str = 'unfair', train_init: DataArray | slice | None = None, train_time: DataArray | slice | None = None, cv: bool | str = False, **metric_kwargs: Any | None) -> HindcastEnsemble

Remove bias from HindcastEnsemble.

The bias is grouped by the seasonality set via set_options. When wrapping xclim.sdba.adjustment.TrainAdjust, use group instead.

Parameters:
  • alignment – which inits or verification times should be aligned?

    • "maximize": maximize the degrees of freedom by slicing initialized and verif to a common time frame at each lead.

    • "same_inits": slice to a common init frame prior to computing metric. This philosophy follows the thought that each lead should be based on the same set of initializations.

    • "same_verif": slice to a common/consistent verification time frame prior to computing metric. This philosophy follows the thought that each lead should be based on the same set of verification dates.

  • how – what kind of bias removal to perform. Defaults to "additive_mean". Select from:

  • train_test_split – How to separate the train period, used to calculate the bias, from the test period, to which the bias correction is applied. For a detailed description, see Risbey et al. 2021:

    • "fair": no overlap between train and test (recommended). Set either train_init or train_time.

    • "unfair": completely overlapping train and test (default).

    • "unfair-cv": overlapping train and test except for current init, which is left out (set cv="LOO").

  • train_init – Define initializations for training when alignment="same_inits" or alignment="maximize".

  • train_time – Define time for training when alignment="same_verif".

  • cv – Only relevant when train_test_split="unfair-cv". Defaults to False.

    • True/"LOO": Calculate the bias by leaving the given initialization out.

      Don’t use cv="LOO", see comment.

    • False: include all initializations in the calculation of the bias, which is much faster but yields similar skill with a large N of initializations.

  • **metric_kwargs – passed to xclim.sdba (including group) or to XBiasCorrection from bias_correction.
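
The three alignment options can be illustrated with a minimal pure-Python sketch. It is not climpred's implementation: the helper align, the annual toy initializations and the integer leads are assumptions made for illustration, with the verification time taken as init + lead.

```python
def align(inits, leads, obs_times, alignment):
    """Return {lead: [init, ...]}: the initializations used at each lead.

    Hypothetical helper for illustration only; verification time = init + lead.
    """
    inits, obs = set(inits), set(obs_times)
    if alignment == "maximize":
        # Per lead, keep every init whose verification time is observed.
        return {l: sorted(i for i in inits if i + l in obs) for l in leads}
    if alignment == "same_inits":
        # Keep only inits that can be verified at every lead.
        common = sorted(i for i in inits if all(i + l in obs for l in leads))
        return {l: common for l in leads}
    if alignment == "same_verif":
        # Keep only verification times reachable from some init at every lead.
        common = sorted(t for t in obs if all(t - l in inits for l in leads))
        return {l: [t - l for t in common] for l in leads}
    raise ValueError(f"unknown alignment: {alignment!r}")


# Toy setup (assumed): six annual inits, three leads, six observed years.
inits = range(2000, 2006)      # 2000..2005
leads = [1, 2, 3]
obs_times = range(2001, 2007)  # 2001..2006

maximize = align(inits, leads, obs_times, "maximize")
same_inits = align(inits, leads, obs_times, "same_inits")
same_verif = align(inits, leads, obs_times, "same_verif")

for name, result in [("maximize", maximize), ("same_inits", same_inits),
                     ("same_verif", same_verif)]:
    print(name, {lead: len(used) for lead, used in result.items()})
```

In this toy setup "maximize" keeps a different number of initializations at each lead, whereas "same_inits" and "same_verif" each keep four per lead, selected by initialization and by verification time respectively.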

Returns:

Bias-removed HindcastEnsemble.

Example

Skill from raw model output without bias reduction:

>>> HindcastEnsemble.verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.08359 0.08141 0.08362 ... 0.1361 0.1552 0.1664
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     64
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []

Because this HindcastEnsemble is already bias reduced, train_test_split="unfair" has hardly any effect. Use all initializations to calculate the bias and verify skill:

>>> HindcastEnsemble.remove_bias(
...     alignment="maximize", how="additive_mean", train_test_split="unfair"
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.08349 0.08039 0.07522 ... 0.07305 0.08107 0.08255
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     64
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []
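
As a rough illustration of what how="additive_mean" with train_test_split="unfair" amounts to at a single lead, the sketch below subtracts the mean forecast-minus-observation error estimated from all initializations. It also shows a leave-one-out variant corresponding to "unfair-cv". The function names and toy numbers are hypothetical, and grouping by seasonality is ignored; this is not climpred's code.

```python
from statistics import mean


def additive_mean_bias(forecasts, observations):
    """Mean forecast-minus-observation error over the training pairs."""
    return mean(f - o for f, o in zip(forecasts, observations))


def remove_bias_unfair(forecasts, observations):
    """Train and test on the same initializations ("unfair")."""
    bias = additive_mean_bias(forecasts, observations)
    return [f - bias for f in forecasts]


def remove_bias_loo(forecasts, observations):
    """Leave each initialization out of its own bias estimate ("unfair-cv")."""
    corrected = []
    for k, f in enumerate(forecasts):
        train_f = forecasts[:k] + forecasts[k + 1:]
        train_o = observations[:k] + observations[k + 1:]
        corrected.append(f - additive_mean_bias(train_f, train_o))
    return corrected


def rmse(forecasts, observations):
    """Root-mean-square error between paired forecasts and observations."""
    return mean((f - o) ** 2 for f, o in zip(forecasts, observations)) ** 0.5


# Toy data (assumed): one value per initialization at a single lead.
observations = [0.0, 1.0, 2.0, 3.0]
forecasts = [1.5, 2.0, 3.5, 4.0]

rmse_raw = rmse(forecasts, observations)
rmse_unfair = rmse(remove_bias_unfair(forecasts, observations), observations)
rmse_loo = rmse(remove_bias_loo(forecasts, observations), observations)
print(rmse_raw, rmse_unfair, rmse_loo)
```

For an additive mean, the leave-one-out residual equals the in-sample residual scaled by N/(N-1), so the two variants converge as the number of initializations N grows.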

Use initializations 1954 to 1980 to calculate the bias. Because this HindcastEnsemble is already bias reduced, train_test_split="fair" worsens skill here. In general, train_test_split="fair" is recommended for a fair comparison against real-time forecasts:

>>> HindcastEnsemble.remove_bias(
...     alignment="maximize",
...     how="additive_mean",
...     train_test_split="fair",
...     train_init=slice("1954", "1980"),
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.132 0.1085 0.08722 ... 0.08209 0.08969 0.08732
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     37
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []
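
The drop from 64 to 37 initializations in the output above is consistent with 27 training initializations being excluded from the test set. A minimal sketch of such a split follows, assuming 64 annual initializations starting in 1954 (an assumption, not stated on this page) and synthetic forecast values. fair_remove_bias is a hypothetical helper, not part of climpred.

```python
from statistics import mean


def fair_remove_bias(inits, forecasts, observations, train_start, train_stop):
    """Estimate the additive mean bias on the train inits only and apply it
    to the remaining (test) inits. Bounds are inclusive, mirroring a
    label-based slice. Hypothetical helper for illustration.
    """
    train = [k for k, i in enumerate(inits) if train_start <= i <= train_stop]
    test = [k for k, i in enumerate(inits) if not (train_start <= i <= train_stop)]
    bias = mean(forecasts[k] - observations[k] for k in train)
    return {inits[k]: forecasts[k] - bias for k in test}


# Assumption: 64 annual initializations, 1954..2017.
inits = list(range(1954, 1954 + 64))
observations = [0.0] * len(inits)
# Synthetic forecasts: error of 1.0 in the train period, 1.5 afterwards.
forecasts = [1.0 if i <= 1980 else 1.5 for i in inits]

corrected = fair_remove_bias(inits, forecasts, observations, 1954, 1980)
print(len(corrected), min(corrected), max(corrected))
```

Because the bias is estimated only from the train period, any error that differs between train and test periods remains in the corrected test forecasts (0.5 in this synthetic case).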

Wrap a method from xclim via how and provide group for the groupby:

>>> HindcastEnsemble.remove_bias(
...     alignment="same_inits",
...     group="init",
...     how="EmpiricalQuantileMapping",
...     train_test_split="unfair",
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.07097 0.07402 0.06653 ... 0.05823 0.06697 0.0707
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     52
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []
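
To give an intuition for quantile mapping, the sketch below implements a bare-bones nearest-rank variant in pure Python: a forecast value is located in the sorted training forecasts and replaced by the training observation of the same rank. This is only a toy under stated assumptions. The function name and data are hypothetical, and xclim's EmpiricalQuantileMapping differs in detail, for example by supporting grouping via group.

```python
from bisect import bisect_right
from math import ceil


def empirical_quantile_map(train_forecasts, train_observations, value):
    """Map value to the training observation at the same empirical rank.

    Toy nearest-rank quantile mapping; hypothetical helper, not xclim's API.
    """
    fc = sorted(train_forecasts)
    ob = sorted(train_observations)
    # Number of training forecasts less than or equal to value.
    rank = bisect_right(fc, value)
    # Corresponding position in the sorted observations, clamped to range.
    idx = ceil(rank * len(ob) / len(fc)) - 1
    idx = min(max(idx, 0), len(ob) - 1)
    return ob[idx]


# Toy data (assumed): the model is twice as large as the observations.
train_forecasts = [2.0, 4.0, 6.0, 8.0, 10.0]
train_observations = [1.0, 2.0, 3.0, 4.0, 5.0]

mapped = [
    empirical_quantile_map(train_forecasts, train_observations, v)
    for v in train_forecasts
]
print(mapped)
```

Unlike an additive mean, such a mapping adjusts the whole distribution, so it can also correct errors in spread. Values outside the training range are clamped to the smallest or largest training observation in this sketch.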

Wrap a method from bias_correction via how:

>>> HindcastEnsemble.remove_bias(
...     alignment="same_inits",
...     how="modified_quantile",
...     train_test_split="unfair",
... ).verify(
...     metric="rmse", comparison="e2o", alignment="maximize", dim="init"
... )
<xarray.Dataset>
Dimensions:  (lead: 10)
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Data variables:
    SST      (lead) float64 0.07628 0.08293 0.08169 ... 0.1577 0.1821 0.2087
Attributes:
    prediction_skill_software:     climpred https://climpred.readthedocs.io/
    skill_calculated_by_function:  HindcastEnsemble.verify()
    number_of_initializations:     52
    number_of_members:             10
    alignment:                     maximize
    metric:                        rmse
    comparison:                    e2o
    dim:                           init
    reference:                     []