Interval Score

scoringrules.interval_score

interval_score(
    obs: ArrayLike,
    lower: ArrayLike,
    upper: ArrayLike,
    alpha: ArrayLike,
    *,
    backend: Backend = None
) -> Array

Compute the interval score, also known as the Winkler score.

The interval score (Gneiting & Raftery, 2007) is defined as

\[ \text{IS} = \begin{cases} (u - l) + \frac{2}{\alpha}(l - y) & \text{for } y < l \\ (u - l) & \text{for } l \leq y \leq u \\ (u - l) + \frac{2}{\alpha}(y - u) & \text{for } y > u \end{cases} \]

for a \(1 - \alpha\) prediction interval \([l, u]\) and the true value \(y\).
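The three cases of the formula can be checked by hand with a scalar reference function. The sketch below is a plain-Python transcription of the definition above, not the library's implementation; the helper name `interval_score_scalar` is hypothetical.

```python
def interval_score_scalar(y: float, lower: float, upper: float, alpha: float) -> float:
    """Interval score of the (1 - alpha) interval [lower, upper] for observation y."""
    width = upper - lower
    if y < lower:
        # Observation falls below the interval: add the scaled miss distance.
        return width + (2.0 / alpha) * (lower - y)
    if y > upper:
        # Observation falls above the interval: add the scaled miss distance.
        return width + (2.0 / alpha) * (y - upper)
    # Observation is covered: only the interval width is charged.
    return width


# A 50% interval [0.0, 0.4], scored against three different observations.
inside = interval_score_scalar(0.1, 0.0, 0.4, 0.5)   # width only
below = interval_score_scalar(-0.1, 0.0, 0.4, 0.5)   # width + (2 / 0.5) * 0.1
above = interval_score_scalar(0.5, 0.0, 0.4, 0.5)    # width + (2 / 0.5) * 0.1
```

A covered observation costs only the interval width, so narrow intervals are rewarded. A miss adds the distance to the nearest bound scaled by 2/alpha, so misses are more expensive for intervals with higher nominal coverage.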

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| obs | ArrayLike | The observations as a scalar or array of values. | required |
| lower | ArrayLike | The predicted lower bound of the prediction interval (PI) as a scalar or array of values. | required |
| upper | ArrayLike | The predicted upper bound of the PI as a scalar or array of values. | required |
| alpha | ArrayLike | The significance level alpha of the PI, whose nominal coverage is 1 - alpha, as a scalar or array of values. | required |
| backend | Backend | The name of the backend used for computations. Defaults to 'numba' if available, else 'numpy'. | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | Array | Array with the interval score for the input values. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If the lower and upper bounds do not have the same shape or if the number of PIs does not match the number of alpha levels. |

Notes

Given an obs array of shape (...,), when multiple PIs are evaluated and alpha is an array of shape (K,), lower and upper must have shape (...,K), and the output has shape (...,K).
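The shape convention can be sketched with NumPy broadcasting. The helper `interval_score_np` below is hypothetical and only mirrors the formula and the shape checks described on this page; it is not the library's implementation, and the Uniform(0, 1) forecast is purely illustrative.

```python
import numpy as np


def interval_score_np(obs, lower, upper, alpha):
    """Interval scores for K intervals per observation, returned with shape (..., K)."""
    obs = np.asarray(obs, dtype=float)[..., None]  # (..., 1), broadcasts against K
    lower = np.asarray(lower, dtype=float)         # (..., K)
    upper = np.asarray(upper, dtype=float)         # (..., K)
    alpha = np.asarray(alpha, dtype=float)         # (K,), assumed one-dimensional here
    if lower.shape != upper.shape:
        raise ValueError("lower and upper must have the same shape")
    if lower.shape[-1] != alpha.shape[0]:
        raise ValueError("number of PIs must match the number of alpha levels")
    width = upper - lower
    below = np.maximum(lower - obs, 0.0)  # positive only when obs < lower
    above = np.maximum(obs - upper, 0.0)  # positive only when obs > upper
    return width + (2.0 / alpha) * (below + above)


rng = np.random.default_rng(0)
obs = rng.uniform(size=10)                        # shape (10,)
alpha = np.linspace(0.1, 0.9, 5)                  # shape (5,), so K = 5
# Central intervals of a Uniform(0, 1) forecast: [alpha/2, 1 - alpha/2].
lower = np.broadcast_to(alpha / 2, (10, 5))       # shape (10, 5)
upper = np.broadcast_to(1 - alpha / 2, (10, 5))   # shape (10, 5)
scores = interval_score_np(obs, lower, upper, alpha)
print(scores.shape)
```

Adding a trailing axis to obs lets each observation be compared against all K intervals at once, which is why the result carries the extra K dimension.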

Examples:

>>> import numpy as np
>>> import scoringrules as sr
>>> sr.interval_score(0.1, 0.0, 0.4, 0.5)
0.4
>>> sr.interval_score(
...     obs=np.array([0.1, 0.2, 0.3]),
...     lower=np.array([0.0, 0.1, 0.2]),
...     upper=np.array([0.4, 0.3, 0.5]),
...     alpha=0.5,
... )
array([0.4, 0.2, 0.3])
>>> sr.interval_score(
...     obs=np.random.uniform(size=(10,)),
...     lower=np.ones((10,5)) * 0.2,
...     upper=np.ones((10,5)) * 0.8,
...     alpha=np.linspace(0.1, 0.9, 5),
... ).shape
(10, 5)

scoringrules.weighted_interval_score

weighted_interval_score(
    obs: ArrayLike,
    median: Array,
    lower: Array,
    upper: Array,
    alpha: Array,
    /,
    w_median: Optional[float] = None,
    w_alpha: Optional[Array] = None,
    *,
    backend: Backend = None,
) -> Array

Compute the weighted interval score (WIS).

The WIS (Bracher et al., 2021) is defined as

\[ \text{WIS}_{\alpha_{0:K}}(F, y) = \frac{1}{K + 0.5} \left( w_0 \, |y - m| + \sum_{k=1}^K w_k \, \text{IS}_{\alpha_k}(F, y) \right) \]

where \(m\) denotes the median prediction, \(w_0\) the weight of the median prediction, \(\text{IS}_{\alpha_k}(F, y)\) the interval score of the \(1 - \alpha_k\) prediction interval, and \(w_k\) the corresponding weight. The WIS is calculated for a set of \(K\) central PIs and the predictive median. The weights are optional and default to the canonical weights \(w_k = \frac{\alpha_k}{2}\) and \(w_0 = 0.5\). For these weights, it holds that:

\[ \text{WIS}_{\alpha_{0:K}}(F, y) \approx \text{CRPS}(F, y). \]
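The approximation can be checked numerically for a forecast whose CRPS has a closed form. The sketch below is plain Python and assumes the canonical weights of Bracher et al., \(w_k = \alpha_k / 2\) and \(w_0 = 0.5\); the helper names `interval_score_scalar` and `wis_scalar` are hypothetical and are not part of the library. The forecast is Uniform(0, 1), for which the CRPS at \(y \in [0, 1]\) is \(y^3/3 + (1 - y)^3/3\).

```python
def interval_score_scalar(y, lower, upper, alpha):
    """Interval score of the (1 - alpha) interval [lower, upper] for observation y."""
    width = upper - lower
    if y < lower:
        return width + (2.0 / alpha) * (lower - y)
    if y > upper:
        return width + (2.0 / alpha) * (y - upper)
    return width


def wis_scalar(y, median, lowers, uppers, alphas, w_median=0.5, w_alphas=None):
    """Weighted interval score for one observation and K central intervals."""
    if w_alphas is None:
        # Assumed canonical weights: w_k = alpha_k / 2.
        w_alphas = [a / 2.0 for a in alphas]
    total = w_median * abs(y - median)
    for lo, up, a, w in zip(lowers, uppers, alphas, w_alphas):
        total += w * interval_score_scalar(y, lo, up, a)
    return total / (len(alphas) + 0.5)


# Uniform(0, 1) forecast: median 0.5, central (1 - a) interval [a/2, 1 - a/2].
K = 99
alphas = [k / (K + 1) for k in range(1, K + 1)]
lowers = [a / 2 for a in alphas]
uppers = [1 - a / 2 for a in alphas]

y = 0.3
wis_value = wis_scalar(y, 0.5, lowers, uppers, alphas)
crps_uniform = y**3 / 3 + (1 - y) ** 3 / 3  # closed-form CRPS of Uniform(0, 1)
print(wis_value, crps_uniform)
```

With these weights each term \(w_k \, \text{IS}_{\alpha_k}\) equals the sum of the pinball losses at the quantile levels \(\alpha_k / 2\) and \(1 - \alpha_k / 2\), so the WIS acts as a quadrature rule for the quantile-based form of the CRPS and the two values move closer as \(K\) grows.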

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| obs | ArrayLike | The observations as a scalar or array of shape (...,). | required |
| median | Array | The predicted median of the distribution as a scalar or array of shape (...,). | required |
| lower | Array | The predicted lower bound of the PI. If alpha is an array of shape (K,), lower must have shape (...,K). | required |
| upper | Array | The predicted upper bound of the PI. If alpha is an array of shape (K,), upper must have shape (...,K). | required |
| alpha | Array | The significance levels alpha of the prediction intervals, whose nominal coverage is 1 - alpha, as an array of shape (K,). | required |
| w_median | Optional[float] | The weight for the median prediction. Defaults to 0.5. | None |
| w_alpha | Optional[Array] | The weights for the PIs. Defaults to alpha/2. | None |
| backend | Backend | The name of the backend used for computations. Defaults to 'numba' if available, else 'numpy'. | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | Array | An array of weighted interval scores with the same shape as obs. |