Interval Score
Interval or Winkler Score
For a prediction interval (PI), the interval or Winkler score is given by:
\(\text{IS} = \begin{cases} (u - l) + \frac{2}{\alpha}(l - y) & \text{for } y < l \\ (u - l) & \text{for } l \leq y \leq u \\ (u - l) + \frac{2}{\alpha}(y - u) & \text{for } y > u. \\ \end{cases}\)
for a \((1 - \alpha)\) PI of \([l, u]\) and the true value \(y\) (Gneiting & Raftery, 2007; Bracher et al., 2021; Winkler, 1972).
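The three cases above can be traced with a small scalar sketch. The helper name `winkler_score` is hypothetical and used only for illustration; it is not the library function, and it handles a single observation and a single interval.

```python
def winkler_score(y, lower, upper, alpha):
    """Interval (Winkler) score for one (1 - alpha) prediction interval.

    Hypothetical illustration helper, not part of scoringrules.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    width = upper - lower
    if y < lower:  # observation falls below the interval
        return width + (2 / alpha) * (lower - y)
    if y > upper:  # observation falls above the interval
        return width + (2 / alpha) * (y - upper)
    return width   # observation is covered: only the width is charged


# A 90% interval [2, 8], i.e. alpha = 0.1, so misses are scaled by 2 / 0.1 = 20.
inside = winkler_score(5.0, 2.0, 8.0, 0.1)  # covered: width only
below = winkler_score(1.0, 2.0, 8.0, 0.1)   # missed by 1.0 on the lower side
above = winkler_score(8.5, 2.0, 8.0, 0.1)   # missed by 0.5 on the upper side
```

Lower scores are better: a narrow interval is rewarded, but each miss is penalised in proportion to its distance from the interval and to \(2/\alpha\).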
Weighted Interval Score
The weighted interval score (WIS) is defined as
\(\text{WIS}_{\alpha_{0:K}}(F, y) = \frac{1}{K+0.5}(w_0 \times |y - m| + \sum_{k=1}^K (w_k \times IS_{\alpha_k}(F, y)))\)
where \(m\) denotes the median prediction, \(w_0\) the weight of the median prediction, \(IS_{\alpha_k}(F, y)\) the interval score for the \(1 - \alpha_k\) prediction interval, and \(w_k\) the corresponding weight. The WIS is calculated for a set of \(K\) (central) PIs and the predictive median (Bracher et al., 2021). The weights are optional; the defaults are the canonical weights \(w_k = \frac{\alpha_k}{2}\) and \(w_0 = 0.5\). For these weights, it holds that:
\(\text{WIS}_{\alpha_{0:K}}(F, y) \approx \text{CRPS}(F, y).\)
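A rough numerical check of this approximation, as a sketch under stated assumptions: the forecast is a standard normal distribution, the weights are \(w_0 = 1/2\) and \(w_k = \alpha_k / 2\) as in Bracher et al. (2021), and the CRPS is taken from the well-known closed form for a Gaussian forecast. All function names here are hypothetical helpers, not library calls.

```python
from math import pi, sqrt
from statistics import NormalDist


def interval_score(y, lower, upper, alpha):
    # Width plus 2/alpha times the distance by which y misses the interval.
    miss = max(lower - y, 0.0) + max(y - upper, 0.0)
    return (upper - lower) + (2 / alpha) * miss


def wis(y, median, lowers, uppers, alphas):
    # Weighted interval score with w_0 = 1/2 and w_k = alpha_k / 2.
    total = 0.5 * abs(y - median)
    for lo, up, a in zip(lowers, uppers, alphas):
        total += (a / 2) * interval_score(y, lo, up, a)
    return total / (len(alphas) + 0.5)


def crps_normal(y, mu, sigma):
    # Closed-form CRPS of a N(mu, sigma^2) forecast.
    std = NormalDist()
    z = (y - mu) / sigma
    return sigma * (z * (2 * std.cdf(z) - 1) + 2 * std.pdf(z) - 1 / sqrt(pi))


forecast = NormalDist(mu=0.0, sigma=1.0)
alphas = [k / 50 for k in range(1, 50)]               # 0.02, 0.04, ..., 0.98
lowers = [forecast.inv_cdf(a / 2) for a in alphas]     # lower ends of central PIs
uppers = [forecast.inv_cdf(1 - a / 2) for a in alphas] # upper ends of central PIs

y = 0.7
wis_value = wis(y, forecast.median, lowers, uppers, alphas)
crps_value = crps_normal(y, 0.0, 1.0)
print(wis_value, crps_value)  # the two values should be close
```

With these weights each term \(w_k \times IS_{\alpha_k}\) equals the sum of the pinball losses at levels \(\alpha_k/2\) and \(1 - \alpha_k/2\), so the WIS is twice the average pinball loss over the chosen quantile levels, which is a discretisation of the CRPS integral. The approximation tightens as more levels are used.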
scoringrules.interval_score
interval_score(
observations: ArrayLike,
lower: Array,
upper: Array,
alpha: Union[float, Array],
/,
axis: int = -1,
*,
backend: Backend = None,
) -> Array
Compute the interval score, also known as the Winkler score (Gneiting & Raftery, 2007), for 1 - \(\alpha\) prediction intervals PI = [lower, upper].
The interval score is defined as
\(\text{IS} = \begin{cases} (u - l) + \frac{2}{\alpha}(l - y) & \text{for } y < l \\ (u - l) & \text{for } l \leq y \leq u \\ (u - l) + \frac{2}{\alpha}(y - u) & \text{for } y > u. \\ \end{cases}\)
for a \(1 - \alpha\) PI of \([l, u]\) and the true value \(y\).
Note
alpha can be a float or an array of levels. If alpha is a float, the output has the same shape as the observations, and observations, lower and upper are assumed to share that shape. If alpha is a vector, the function broadcasts the observations accordingly.
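The broadcasting idea can be illustrated with plain NumPy. This is only a sketch: the helper `interval_score_np` is hypothetical, and the layout chosen here (one trailing axis indexing the alpha levels) is an assumption, not a statement about how the library arranges its output.

```python
import numpy as np


def interval_score_np(obs, lower, upper, alpha):
    """Vectorised interval score; inputs only need to be broadcastable."""
    obs, lower, upper, alpha = map(np.asarray, (obs, lower, upper, alpha))
    miss_low = np.maximum(lower - obs, 0.0)  # distance below the interval
    miss_up = np.maximum(obs - upper, 0.0)   # distance above the interval
    return (upper - lower) + (2.0 / alpha) * (miss_low + miss_up)


obs = np.array([1.0, 5.0, 9.0])     # shape (3,): three observations
alpha = np.array([0.5, 0.1])        # shape (2,): a 50% and a 90% interval
lower = np.array([[3.0, 2.0]] * 3)  # shape (3, 2): one bound per obs and level
upper = np.array([[7.0, 8.0]] * 3)  # shape (3, 2)

# Adding a trailing axis to obs lets it broadcast against the alpha axis.
scores = interval_score_np(obs[:, None], lower, upper, alpha)
print(scores.shape)  # (3, 2)
```

Each row of `scores` belongs to one observation and each column to one alpha level.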
Parameters:
Name | Type | Description | Default |
---|---|---|---|
observations | ArrayLike | The observed values. | required |
lower | Array | The predicted lower bound of the prediction interval. | required |
upper | Array | The predicted upper bound of the prediction interval. | required |
alpha | Union[float, Array] | The 1 - alpha level for the prediction interval. | required |
axis | int | The axis corresponding to the ensemble. Default is the last axis. | -1 |
backend | Backend | The name of the backend used for computations. Defaults to 'numba' if available, else 'numpy'. | None |
Returns:
Name | Type | Description |
---|---|---|
score | Array | An array of interval scores for each prediction interval, which should be averaged to get meaningful values. |
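As a small illustration of why the scores are averaged, the sketch below compares two made-up 80% interval forecasts over the same five observations. The data and the helper name are invented for the example; only the mean score per forecaster is meant to be compared.

```python
from statistics import fmean


def interval_score(y, lower, upper, alpha):
    # Width plus 2/alpha times the distance by which y misses the interval.
    miss = max(lower - y, 0.0) + max(y - upper, 0.0)
    return (upper - lower) + (2 / alpha) * miss


observations = [10.2, 9.1, 11.8, 10.0, 13.5]  # made-up data
alpha = 0.2                                    # 80% prediction intervals

# Forecaster A always issues a narrow interval, forecaster B a wide one.
narrow = [interval_score(y, 9.5, 10.5, alpha) for y in observations]
wide = [interval_score(y, 8.0, 12.0, alpha) for y in observations]

mean_narrow = fmean(narrow)
mean_wide = fmean(wide)
print(mean_narrow, mean_wide)
```

A single score says little on its own, since one lucky or unlucky observation dominates it. In this toy data the narrow interval misses three of five observations and ends up with the higher (worse) mean score despite its smaller width.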
scoringrules.weighted_interval_score
weighted_interval_score(
observations: ArrayLike,
median: Array,
lower: Array,
upper: Array,
alpha: Array,
/,
weight_median: Optional[float] = None,
weight_alpha: Optional[Array] = None,
axis: int = -1,
*,
backend: Backend = None,
) -> Array
Compute the weighted interval score (Bracher, Ray, Gneiting & Reich, 2021) for a set of 1 - \(\alpha\) prediction intervals PI = [lower, upper] and the predictive median.
The weighted interval score (WIS) is defined as
\(\text{WIS}_{\alpha_{0:K}}(F, y) = \frac{1}{K+0.5}(w_0 \times |y - m| + \sum_{k=1}^K (w_k \times IS_{\alpha_k}(F, y)))\)
where \(m\) denotes the median prediction, \(w_0\) the weight of the median prediction, \(IS_{\alpha_k}(F, y)\) the interval score for the \(1 - \alpha_k\) prediction interval, and \(w_k\) the corresponding weight. The WIS is calculated for a set of \(K\) (central) PIs and the predictive median. The weights are optional; the defaults are the canonical weights \(w_k = \frac{\alpha_k}{2}\) and \(w_0 = 0.5\). Using the canonical weights, the WIS can be used to approximate the CRPS.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
observations | ArrayLike | The observed values. | required |
median | Array | The median prediction. | required |
lower | Array | The predicted lower bound of the prediction interval. | required |
upper | Array | The predicted upper bound of the prediction interval. | required |
alpha | Array | The 1 - alpha level for the prediction interval. | required |
weight_median | Optional[float] | The weight for the median prediction. | None |
weight_alpha | Optional[Array] | The weights for the PI. | None |
axis | int | The axis corresponding to the ensemble. Default is the last axis. | -1 |
backend | Backend | The name of the backend used for computations. Defaults to 'numba' if available, else 'numpy'. | None |
Returns:
Name | Type | Description |
---|---|---|
score | Array | An array of weighted interval scores for each observation, which should be averaged to get meaningful values. |
- Tilmann Gneiting and Adrian E Raftery. Strictly Proper Scoring Rules, Prediction, and Estimation. Journal of the American Statistical Association, 2007. URL: https://doi.org/10.1198/016214506000001437, doi:10.1198/016214506000001437.
- Johannes Bracher, Evan L Ray, Tilmann Gneiting, and Nicholas G Reich. Evaluating epidemic forecasts in an interval format. PLoS Computational Biology, 17(2):e1008618, 2021.
- Robert L Winkler. A decision-theoretic approach to interval estimation. Journal of the American Statistical Association, 67(337):187–191, 1972.