climpred.metrics._roc

climpred.metrics._roc(forecast: xarray.Dataset, verif: xarray.Dataset, dim: Optional[Union[str, List[str]]] = None, **metric_kwargs: Any) -> xarray.Dataset

Receiver Operating Characteristic.

Parameters
  • observations – Labeled array(s) over which to apply the function. If bin_edges=="continuous", observations are binary.

  • forecasts – Labeled array(s) over which to apply the function. If bin_edges=="continuous", forecasts are probabilities.

  • dim – The dimension(s) over which to aggregate. Defaults to None, meaning aggregation over all dims other than lead.

  • logical – Function returning a boolean result. It is applied to the verification data and forecasts, which are then averaged with mean("member") so that both lie in the interval [0, 1]. Passed via metric_kwargs.

  • bin_edges (array_like, str) – Bin edges for categorising observations and forecasts. Similar to np.histogram, all but the last (righthand-most) bin include the left edge and exclude the right edge. The last bin includes both edges. bin_edges will be sorted in ascending order. If bin_edges=="continuous", calculate bin_edges from forecasts, equal to sklearn.metrics.roc_curve(f_boolean, o_prob). Passed via metric_kwargs. Defaults to “continuous”.

  • drop_intermediate (bool) – Whether to drop some suboptimal thresholds that would not appear on a plotted ROC curve, which is useful for creating lighter ROC curves. Defaults to False, whereas sklearn.metrics.roc_curve defaults to True. Passed via metric_kwargs.

  • return_results (str) – Specifies how the return value is structured. Passed via metric_kwargs. Defaults to “area”.

    • “area”: return only the area under curve of ROC

    • “all_as_tuple”: return true positive rate and false positive rate at each bin and area under the curve of ROC as tuple

    • “all_as_metric_dim”: return true positive rate and false positive rate at each bin and area under curve of ROC concatenated into new metric dimension
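The logical and bin_edges parameters can be illustrated with a minimal NumPy sketch. It is not climpred's implementation: the toy array, the helper name logical, and the chosen edges are assumptions made for illustration only. It shows how a boolean function followed by a mean over the member axis yields probabilities in [0, 1], and how np.histogram-style edges include the right edge only in the last bin.

```python
import numpy as np

# Hypothetical toy forecast: 4 ensemble members (rows) x 5 initializations (columns).
forecast = np.array([
    [0.3, -0.2, 0.8, -0.5, 0.1],
    [0.1, -0.4, 0.6, 0.2, -0.3],
    [-0.1, -0.1, 0.9, -0.6, 0.4],
    [0.2, 0.3, 0.7, -0.2, -0.2],
])


def logical(x):
    """Hypothetical event definition: is the value positive?"""
    return x > 0


# Apply the boolean function, then average over the member axis (axis 0 here)
# to obtain an event probability per initialization.
forecast_prob = logical(forecast).mean(axis=0)

# Bin the probabilities. As in np.histogram, every bin is [left, right)
# except the last one, which is [left, right].
bin_edges = np.array([0.0, 0.5, 1.0])
counts, _ = np.histogram(forecast_prob, bins=bin_edges)

print(forecast_prob)  # [0.75 0.25 1.   0.25 0.5 ]
print(counts)         # [2 3]
```

The probability 0.5 falls into the second bin because the first bin excludes its right edge, while 1.0 is still counted because the last bin includes both edges.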

Returns

Dataset reduced over the dimensions dim; see the return_results parameter. The true positive rate and false positive rate contain a probability_bin dimension with ascending bin_edges as coordinates.
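To make the returned quantities concrete, the sketch below computes a false positive rate, a true positive rate, and the area under the curve from binary observations and forecast probabilities. It is a plain NumPy illustration under stated assumptions, not the code climpred runs: the function name roc_sketch, the toy data, and the threshold list are all hypothetical, and the area is obtained with a simple trapezoidal rule.

```python
import numpy as np


def roc_sketch(obs_binary, fcst_prob, thresholds):
    """Return (fpr, tpr, area) for thresholds given in descending order."""
    obs = np.asarray(obs_binary, dtype=bool)
    prob = np.asarray(fcst_prob, dtype=float)
    fpr, tpr = [], []
    for t in thresholds:
        predicted = prob >= t  # event is forecast when probability reaches t
        tpr.append((predicted & obs).sum() / obs.sum())
        fpr.append((predicted & ~obs).sum() / (~obs).sum())
    fpr, tpr = np.array(fpr), np.array(tpr)
    # Trapezoidal rule; fpr is non-decreasing because thresholds descend.
    area = float(np.sum(np.diff(fpr) * (tpr[:-1] + tpr[1:]) / 2.0))
    return fpr, tpr, area


# Hypothetical data: three events, two non-events.
obs = np.array([1, 0, 1, 1, 0])
prob = np.array([0.75, 0.25, 1.0, 0.25, 0.5])
thresholds = [1.5, 1.0, 0.5, 0.0]

fpr, tpr, area = roc_sketch(obs, prob, thresholds)
print(fpr)   # false positive rate per threshold
print(tpr)   # true positive rate per threshold
print(area)  # area under the ROC curve
```

The first threshold lies above every probability so that the curve starts at (0, 0), and the last threshold of 0.0 classifies everything as an event so that it ends at (1, 1).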

Notes

minimum        0.0
maximum        1.0
perfect        1.0
orientation    positive
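The bounds above can be checked with a small sketch that uses the pairwise-ranking view of the area under the ROC curve: the fraction of (event, non-event) pairs in which the event received the higher probability, with ties counted as one half. This is an equivalent textbook formulation rather than climpred's code, and the function name auc_pairwise and the toy data are assumptions.

```python
import numpy as np


def auc_pairwise(obs_binary, fcst_prob):
    """Area under the ROC curve as a pairwise ranking probability."""
    obs = np.asarray(obs_binary, dtype=bool)
    prob = np.asarray(fcst_prob, dtype=float)
    pos = prob[obs][:, None]   # probabilities issued when the event occurred
    neg = prob[~obs][None, :]  # probabilities issued when it did not
    # Count correctly ordered pairs, with ties contributing one half.
    return float(((pos > neg) + 0.5 * (pos == neg)).mean())


obs = np.array([1, 1, 0, 0])

perfect = auc_pairwise(obs, [0.9, 0.8, 0.2, 0.1])    # events always ranked higher
worst = auc_pairwise(obs, [0.1, 0.2, 0.8, 0.9])      # events always ranked lower
no_skill = auc_pairwise(obs, [0.5, 0.5, 0.5, 0.5])   # no discrimination at all

print(perfect, worst, no_skill)  # 1.0 0.0 0.5
```

A higher value indicates better discrimination, which matches the positive orientation; a forecast that cannot separate events from non-events scores 0.5.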

Example

>>> bin_edges = np.array([-0.5, 0.0, 0.5, 1.0])
>>> HindcastEnsemble.verify(
...     metric="roc",
...     comparison="m2o",
...     dim=["member", "init"],
...     alignment="same_verifs",
...     bin_edges=bin_edges,
... ).SST
<xarray.DataArray 'SST' (lead: 10)>
array([0.84385185, 0.82841667, 0.81358547, 0.8393463 , 0.82551752,
       0.81987778, 0.80719573, 0.80081909, 0.79046553, 0.78037564])
Coordinates:
  * lead     (lead) int32 1 2 3 4 5 6 7 8 9 10
    skill    <U11 'initialized'
Attributes:
    units:    None

Get area under the curve, false positive rate and true positive rate as metric dimension by specifying return_results="all_as_metric_dim":

>>> def f(ds):
...     return ds > 0
...
>>> HindcastEnsemble.map(f).verify(
...     metric="roc",
...     comparison="m2o",
...     dim=["member", "init"],
...     alignment="same_verifs",
...     bin_edges="continuous",
...     return_results="all_as_metric_dim",
... ).SST.isel(lead=[0, 1])
<xarray.DataArray 'SST' (lead: 2, metric: 3, probability_bin: 3)>
array([[[0.        , 0.116     , 1.        ],
        [0.        , 0.8037037 , 1.        ],
        [0.84385185, 0.84385185, 0.84385185]],

       [[0.        , 0.064     , 1.        ],
        [0.        , 0.72222222, 1.        ],
        [0.82911111, 0.82911111, 0.82911111]]])
Coordinates:
  * probability_bin  (probability_bin) float64 2.0 1.0 0.0
  * lead             (lead) int32 1 2
  * metric           (metric) <U19 'false positive rate' ... 'area under curve'
    skill            <U11 'initialized'
Attributes:
    units:    None