

Generate confidence interval for prediction


During inference, users sometimes want an interval estimate for a prediction rather than a point estimate, because an interval also conveys how uncertain the prediction is and can better guide subsequent decisions. One way to provide this is a confidence interval.

A confidence interval is the mean of your estimate plus and minus a multiple of the variation in that estimate. For time series forecasting, Chronos uses Monte Carlo (MC) dropout to calculate the confidence interval, with reference to this paper.
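To make the idea concrete, here is a minimal sketch of MC dropout on a toy linear model in plain Python. It is not the Chronos implementation: the function names `dropout_forward` and `mc_dropout_predict`, the weights, and the dropout probability are all hypothetical and chosen only for illustration. The sketch keeps dropout active at prediction time, repeats the forward pass many times, and uses the mean and standard deviation of the results as the estimate and its variation.

```python
import random
import statistics


def dropout_forward(x, weights, drop_prob, rng):
    """One stochastic forward pass of a toy linear model with dropout kept active."""
    keep_prob = 1.0 - drop_prob
    total = 0.0
    for xi, wi in zip(x, weights):
        # Randomly drop each input; rescale the survivors (inverted dropout).
        if rng.random() < keep_prob:
            total += wi * xi / keep_prob
    return total


def mc_dropout_predict(x, weights, drop_prob=0.1, repetitions=200, seed=0):
    """Return (mean, std) over repeated stochastic forward passes."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, drop_prob, rng) for _ in range(repetitions)]
    return statistics.fmean(samples), statistics.stdev(samples)


# Hypothetical input and weights for the toy model.
x = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, -0.25, 0.75, 0.1]

yhat, std = mc_dropout_predict(x, weights)
lower, upper = yhat - 1.96 * std, yhat + 1.96 * std
print(f"yhat={yhat:.3f}, std={std:.3f}, 95% interval=({lower:.3f}, {upper:.3f})")
```

With dropout disabled (`drop_prob=0.0`), every pass returns the same value and the standard deviation collapses to zero, which is why the randomness of dropout is what makes the variation measurable.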

Generating a confidence interval for a prediction is easy in Chronos: call predict_interval directly. This guide demonstrates in detail how to generate a confidence interval for a forecaster's predictions.

We will take TCNForecaster and the nyc_taxi dataset as an example in this guide.


Before we begin, install Chronos if it isn't already available. We use PyTorch as the deep learning backend.

[ ]:
!pip install --pre --upgrade bigdl-chronos[pytorch]
# uninstall torchtext to avoid version conflict
!pip uninstall -y torchtext

Forecaster preparation

Before inference, a forecaster must be created and trained. The training process is introduced in detail in the previous guide, Train forecaster on single node, so here we directly create and train a TCNForecaster on the nyc_taxi dataset.

[ ]:
# get data for training, validation, and testing
train_data, val_data, test_data = get_data()
# get a trained forecaster
forecaster = get_trained_forecaster(train_data)

Obtain confidence interval

Once a trained, non-distributed forecaster is ready, call its predict_interval method to obtain a confidence interval. Pass the data you want to predict on (test data in most cases) and the corresponding validation data (which is used to calculate the data bias).


validation_data is only required when calling predict_interval for the first time.
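The exact way Chronos uses the validation data is not described here. As an illustration only, a common formulation in the MC dropout literature treats the mean squared residual on validation data as a data-noise estimate and adds it to the model variance. The helper names `residual_noise_variance` and `total_std` and all numbers below are hypothetical, not part of the Chronos API.

```python
import math


def residual_noise_variance(y_true, y_pred):
    """Mean squared residual on validation data, used here as a data-noise estimate."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return sum(r * r for r in residuals) / len(residuals)


def total_std(model_std, noise_variance):
    """Combine model uncertainty and data noise, assuming they are independent."""
    return math.sqrt(model_std ** 2 + noise_variance)


# Hypothetical validation targets and point predictions.
val_true = [10.0, 12.0, 11.0, 13.0]
val_pred = [9.0, 12.5, 11.5, 12.0]

noise_var = residual_noise_variance(val_true, val_pred)
combined = total_std(model_std=0.8, noise_variance=noise_var)
print(f"noise variance={noise_var:.3f}, combined std={combined:.3f}")
```

Under this formulation the noise term depends only on the model and the validation set, so it could be computed once and reused, which would be consistent with validation data being needed only on the first call.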

The predict_interval method supports data in the following formats:

  1. numpy ndarray (recommended)

  2. pytorch dataloader


You may also want to change the batch_size and repetition_times parameters. If you are not familiar with manual hyperparameter tuning, leave batch_size at its default value. repetition_times sets how many times the prediction is repeated to calculate model uncertainty with MC dropout. A larger value gives a more accurate estimate but takes longer to compute.
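The trade-off behind repetition_times can be illustrated with a small simulation. This is a sketch under stated assumptions, not Chronos code: the repeated dropout passes are replaced by Gaussian draws with a known spread, and `estimate_std` and `spread_of_estimates` are hypothetical helpers. It shows that the estimated standard deviation fluctuates less from run to run as the number of repetitions grows.

```python
import random
import statistics


def estimate_std(repetition_times, seed):
    """Estimate the spread of a stochastic predictor from repetition_times samples."""
    rng = random.Random(seed)
    # Stand-in for repeated dropout forward passes: true mean 10, true std 2.
    samples = [rng.gauss(10.0, 2.0) for _ in range(repetition_times)]
    return statistics.stdev(samples)


def spread_of_estimates(repetition_times, trials=200):
    """How much the std estimate itself varies across independent runs."""
    estimates = [estimate_std(repetition_times, seed) for seed in range(trials)]
    return statistics.stdev(estimates)


for repetition_times in (5, 50, 500):
    spread = spread_of_estimates(repetition_times)
    print(f"repetition_times={repetition_times:>3}: std estimate varies by about {spread:.3f}")
```

Each repetition costs one extra forward pass, so the run time grows roughly linearly with repetition_times while the estimate becomes steadier.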

[ ]:
# obtain prediction and variation by predict_interval
# (batch_size and repetition_times are left at their default values here)
yhat, std = forecaster.predict_interval(data=test_data,
                                        validation_data=val_data)
# obtain the upper and lower bounds of the interval from yhat and std
z_95 = 1.96  # z-value for 95% confidence; use the matching standard normal quantile for other levels
yhat_upper, yhat_lower = yhat + z_95 * std, yhat - z_95 * std
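For confidence levels other than 95%, the multiplier can be computed rather than looked up. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_value` is hypothetical, and the calculation assumes, as the cell above does, that the prediction error is approximately normal.

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided z multiplier for a central interval with the given confidence level."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # A central interval leaves (1 - confidence) / 2 in each tail.
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


for confidence in (0.80, 0.90, 0.95, 0.99):
    print(f"{confidence:.0%} confidence -> z = {z_value(confidence):.3f}")
```

The returned value can replace z_95 when computing yhat_upper and yhat_lower. A higher confidence level gives a larger multiplier and therefore a wider interval.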