Interval Score

Interval or Winkler Score

For a prediction interval (PI), the interval or Winkler score is given by:

\(\text{IS} = \begin{cases} (u - l) + \frac{2}{\alpha}(l - y) & \text{for } y < l \\ (u - l) & \text{for } l \leq y \leq u \\ (u - l) + \frac{2}{\alpha}(y - u) & \text{for } y > u. \\ \end{cases}\)

for a \((1 - \alpha)\) PI \([l, u]\) and the true value \(y\) [1, 2, 3].
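As a minimal sketch of the piecewise definition above, the following pure-Python function evaluates the score for a single interval. The name `interval_score_scalar` is hypothetical and not part of scoringrules; it only illustrates the formula.

```python
def interval_score_scalar(y: float, lower: float, upper: float, alpha: float) -> float:
    """Interval (Winkler) score of one (1 - alpha) PI [lower, upper] for outcome y.

    Hypothetical helper for illustration; not the scoringrules implementation.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower must not exceed upper")

    width = upper - lower  # sharpness term, always paid
    if y < lower:
        return width + (2.0 / alpha) * (lower - y)  # miss below the interval
    if y > upper:
        return width + (2.0 / alpha) * (y - upper)  # miss above the interval
    return width  # outcome covered: only the width counts


# A 90% interval [2, 6] (alpha = 0.1) scored against three made-up outcomes.
covered = interval_score_scalar(4.0, 2.0, 6.0, 0.1)
below = interval_score_scalar(1.0, 2.0, 6.0, 0.1)
above = interval_score_scalar(6.5, 2.0, 6.0, 0.1)
```

A covered outcome pays only the interval width. A miss adds \(2/\alpha\) times its distance to the nearest bound, so the penalty grows as the nominal coverage increases.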

Weighted Interval Score

The weighted interval score (WIS) is defined as

\(\text{WIS}_{\alpha_{0:K}}(F, y) = \frac{1}{K+0.5}(w_0 \times |y - m| + \sum_{k=1}^K (w_k \times IS_{\alpha_k}(F, y)))\)

where \(m\) denotes the median prediction, \(w_0\) the weight of the median prediction, \(IS_{\alpha_k}(F, y)\) the interval score for the \(1 - \alpha_k\) prediction interval, and \(w_k\) the corresponding weight. The WIS is calculated for a set of \(K\) (central) PIs and the predictive median [2]. The weights are optional parameters; the defaults are the canonical weights \(w_k = \frac{\alpha_k}{2}\) and \(w_0 = 0.5\). For these weights, it holds that:

\(\text{WIS}_{\alpha_{0:K}}(F, y) \approx \text{CRPS}(F, y).\)
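The approximation can be motivated by rewriting each term of the WIS as a quantile (pinball) loss. The derivation below is a sketch that assumes the weights \(w_k = \alpha_k / 2\) and \(w_0 = 1/2\) of Bracher et al. [2]. The symbols \(\rho_\tau\), \(q_\tau\), \(l_k\) and \(u_k\) are notation introduced here for illustration.

\(\rho_\tau(q, y) = (\mathbf{1}\{y \leq q\} - \tau)(q - y)\)

\(\frac{\alpha_k}{2}\,\text{IS}_{\alpha_k}(F, y) = \rho_{\alpha_k/2}(l_k, y) + \rho_{1 - \alpha_k/2}(u_k, y), \qquad \frac{1}{2}|y - m| = \rho_{1/2}(m, y)\)

\(\text{WIS}_{\alpha_{0:K}}(F, y) = \frac{1}{2K + 1} \sum_{\tau} 2\,\rho_\tau(q_\tau, y), \qquad \tau \in \{\tfrac{1}{2}\} \cup \{\tfrac{\alpha_k}{2},\, 1 - \tfrac{\alpha_k}{2} : k = 1, \dots, K\}\)

\(\text{CRPS}(F, y) = 2 \int_0^1 \rho_\tau(q_\tau, y)\, d\tau\)

Here \(q_\tau\) is the predictive quantile at level \(\tau\), so \(l_k = q_{\alpha_k/2}\) and \(u_k = q_{1 - \alpha_k/2}\). Under these assumptions the WIS is an average of the CRPS integrand over \(2K + 1\) quantile levels. The quality of the approximation therefore depends on how densely and evenly those levels cover \((0, 1)\).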

scoringrules.interval_score

interval_score(
    observations: ArrayLike,
    lower: Array,
    upper: Array,
    alpha: Union[float, Array],
    /,
    axis: int = -1,
    *,
    backend: Backend = None,
) -> Array

Compute the interval score, also known as the Winkler score (Gneiting & Raftery, 2007), for 1 - \(\alpha\) prediction intervals PI = [lower, upper].

The interval score is defined as

\(\text{IS} = \begin{cases} (u - l) + \frac{2}{\alpha}(l - y) & \text{for } y < l \\ (u - l) & \text{for } l \leq y \leq u \\ (u - l) + \frac{2}{\alpha}(y - u) & \text{for } y > u. \\ \end{cases}\)

for a \((1 - \alpha)\) PI \([l, u]\) and the true value \(y\).

Note

alpha can be a float or an array of levels. If alpha is a float, the output has the same shape as observations, and observations, lower and upper are assumed to share that shape. If alpha is a vector, the function broadcasts observations accordingly.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| observations | ArrayLike | The observed values. | required |
| lower | Array | The predicted lower bound of the prediction interval. | required |
| upper | Array | The predicted upper bound of the prediction interval. | required |
| alpha | Union[float, Array] | The 1 - alpha level for the prediction interval. | required |
| axis | int | The axis corresponding to the ensemble. Default is the last axis. | -1 |
| backend | Backend | The name of the backend used for computations. Defaults to 'numba' if available, else 'numpy'. | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | Array | An array of interval scores for each prediction interval, which should be averaged to get meaningful values. |
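To illustrate why individual scores are averaged, the sketch below compares two sets of 80% intervals on the same observations. It is a pure-Python illustration: the data are made up and the helper `interval_score_scalar` is hypothetical, not the library function.

```python
from statistics import fmean


def interval_score_scalar(y, lower, upper, alpha):
    """Hypothetical scalar interval score; assumes lower <= upper."""
    penalty = max(lower - y, 0.0) + max(y - upper, 0.0)  # distance outside the PI
    return (upper - lower) + (2.0 / alpha) * penalty


observations = [1.0, 2.5, 3.0, 4.2, 7.0]  # made-up outcomes
alpha = 0.2  # 80% prediction intervals

# Forecaster A issues narrow intervals, forecaster B wide ones (both made up).
narrow = [(0.5, 1.5), (2.0, 3.0), (2.5, 3.5), (3.5, 4.5), (4.0, 5.0)]
wide = [(-2.0, 4.0), (-1.0, 6.0), (0.0, 6.0), (1.0, 7.0), (2.0, 8.0)]

scores_narrow = [
    interval_score_scalar(y, lo, hi, alpha) for y, (lo, hi) in zip(observations, narrow)
]
scores_wide = [
    interval_score_scalar(y, lo, hi, alpha) for y, (lo, hi) in zip(observations, wide)
]

# Single scores vary a lot (one miss dominates), so forecasters are compared on the mean.
mean_narrow = fmean(scores_narrow)
mean_wide = fmean(scores_wide)
```

In this made-up example the narrow intervals miss the last observation and receive a score of 21 there, yet their mean score (5.0) is still lower than that of the wide intervals (6.2), which cover every observation.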

scoringrules.weighted_interval_score

weighted_interval_score(
    observations: ArrayLike,
    median: Array,
    lower: Array,
    upper: Array,
    alpha: Array,
    /,
    weight_median: Optional[float] = None,
    weight_alpha: Optional[Array] = None,
    axis: int = -1,
    *,
    backend: Backend = None,
) -> Array

Compute the weighted interval score (WIS) (Bracher, Ray, Gneiting & Reich, 2021) for a set of 1 - \(\alpha\) prediction intervals PI = [lower, upper] and the predictive median.

The weighted interval score (WIS) is defined as

\(\text{WIS}_{\alpha_{0:K}}(F, y) = \frac{1}{K+0.5}(w_0 \times |y - m| + \sum_{k=1}^K (w_k \times IS_{\alpha_k}(F, y)))\)

where \(m\) denotes the median prediction, \(w_0\) the weight of the median prediction, \(IS_{\alpha_k}(F, y)\) the interval score for the \(1 - \alpha_k\) prediction interval, and \(w_k\) the corresponding weight. The WIS is calculated for a set of \(K\) (central) PIs and the predictive median. The weights are optional parameters; the defaults are the canonical weights \(w_k = \frac{\alpha_k}{2}\) and \(w_0 = 0.5\). Using the canonical weights, the WIS can be used to approximate the CRPS.
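The definition can be sketched for a single observation in pure Python. The functions below are hypothetical helpers, not the scoringrules implementation, and they assume the weights \(w_k = \alpha_k / 2\) and \(w_0 = 0.5\) of Bracher et al. when none are supplied. The last lines check the result against the mean of doubled quantile (pinball) losses, which is the form that links the WIS to the CRPS.

```python
def interval_score_scalar(y, lower, upper, alpha):
    """Hypothetical scalar interval score; assumes lower <= upper."""
    penalty = max(lower - y, 0.0) + max(y - upper, 0.0)
    return (upper - lower) + (2.0 / alpha) * penalty


def weighted_interval_score_scalar(y, median, lowers, uppers, alphas,
                                   weight_median=0.5, weights=None):
    """Hypothetical scalar WIS for K central PIs plus the predictive median."""
    if weights is None:
        weights = [a / 2.0 for a in alphas]  # assumed default: w_k = alpha_k / 2
    total = weight_median * abs(y - median)
    for w, lo, hi, a in zip(weights, lowers, uppers, alphas):
        total += w * interval_score_scalar(y, lo, hi, a)
    return total / (len(alphas) + 0.5)


def pinball(y, q, tau):
    """Quantile loss of the forecast q for the tau-quantile."""
    indicator = 1.0 if y <= q else 0.0
    return (indicator - tau) * (q - y)


# Made-up forecast: median 5, 50% PI [4, 6], 90% PI [2, 9]; the outcome is 10.
y, median = 10.0, 5.0
alphas = [0.5, 0.1]
lowers = [4.0, 2.0]
uppers = [6.0, 9.0]

wis = weighted_interval_score_scalar(y, median, lowers, uppers, alphas)

# Equivalent form: mean of 2 * pinball loss over the 2K + 1 quantile levels.
levels = [0.5] + [a / 2 for a in alphas] + [1 - a / 2 for a in alphas]
quantiles = [median] + lowers + uppers
wis_from_quantiles = sum(
    2 * pinball(y, q, t) for q, t in zip(quantiles, levels)
) / len(levels)
```

Both computations give 3.34 for this made-up forecast, up to floating-point rounding.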

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| observations | ArrayLike | The observed values. | required |
| median | Array | The median prediction. | required |
| lower | Array | The predicted lower bound of the prediction interval. | required |
| upper | Array | The predicted upper bound of the prediction interval. | required |
| alpha | Array | The 1 - alpha level for the prediction interval. | required |
| weight_median | Optional[float] | The weight for the median prediction. | None |
| weight_alpha | Optional[Array] | The weights for the PI. | None |
| axis | int | The axis corresponding to the ensemble. Default is the last axis. | -1 |
| backend | Backend | The name of the backend used for computations. Defaults to 'numba' if available, else 'numpy'. | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| score | Array | An array of weighted interval scores for each observation, which should be averaged to get meaningful values. |


  1. Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007. URL: https://doi.org/10.1198/016214506000001437

  2. Johannes Bracher, Evan L Ray, Tilmann Gneiting, and Nicholas G Reich. Evaluating epidemic forecasts in an interval format. PLoS Computational Biology, 17(2):e1008618, 2021.

  3. Robert L Winkler. A decision-theoretic approach to interval estimation. Journal of the American Statistical Association, 67(337):187–191, 1972.