How to optimize a forecaster#

Introduction#

This method traverses existing optimization methods (onnxruntime, openvino, jit, …) and saves the model with minimum latency under the given data and search restrictions (accelerator, precision, accuracy_criterion) in forecaster.accelerated_model. This method must be called before predict and evaluate. Currently, this function only supports non-distributed models.
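The core idea is to benchmark each candidate backend and keep whichever predicts fastest. A minimal pure-Python sketch of that selection loop (illustration only, with dummy callables standing in for the real onnxruntime/openvino/jit variants, not the Chronos implementation):

```python
import time

# a conceptual sketch of "traverse the candidate backends and keep the
# one with minimum latency" (not the Chronos implementation)
def fastest_variant(variants, sample):
    best_name, best_latency = None, float("inf")
    for name, predict_fn in variants.items():
        start = time.perf_counter()
        predict_fn(sample)
        latency = time.perf_counter() - start
        if latency < best_latency:
            best_name, best_latency = name, latency
    return best_name

# dummy "backends" standing in for the accelerated model variants
variants = {
    "original": lambda x: sum(x) / len(x),
    "accelerated": lambda x: sum(x) / len(x),
}
print(fastest_variant(variants, [1.0, 2.0, 3.0]))
```

The real `forecaster.optimize` additionally filters candidates by the search restrictions before comparing latencies.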

Set up#

Before we begin, we need to install Chronos if it isn’t already available. We choose PyTorch as the deep learning backend.

[ ]:
pip install --pre --upgrade bigdl-chronos[pytorch,inference]

Forecaster preparation#

Before the inferencing process, a forecaster should be created and trained. The training process is introduced in detail in the previous guide, Train forecaster on single node, so here we directly create and train a TCNForecaster on the nyc taxi dataset.

[1]:
# data preparation
def get_data():
    from bigdl.chronos.data import get_public_dataset
    from sklearn.preprocessing import StandardScaler

    # load the nyc taxi dataset
    tsdata_train, tsdata_val, tsdata_test = get_public_dataset(name='nyc_taxi')

    stand = StandardScaler()
    for tsdata in [tsdata_train, tsdata_val, tsdata_test]:
        tsdata.impute()\
              .scale(stand, fit=tsdata is tsdata_train)

    # convert `tsdata_train` and `tsdata_test` to pytorch dataloader
    train_data = tsdata_train.to_torch_data_loader(lookback=48, horizon=1)
    test_data = tsdata_test.to_torch_data_loader(lookback=48, horizon=1)

    return train_data, test_data

# trained forecaster preparation
def get_trained_forecaster(train_data):
    from bigdl.chronos.forecaster.tcn_forecaster import TCNForecaster
    # create a TCNForecaster
    forecaster = TCNForecaster(past_seq_len=48,
                               future_seq_len=1,
                               input_feature_num=1,
                               output_feature_num=1)

    # train the forecaster on the training data
    forecaster.fit(train_data)
    return forecaster
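The `lookback=48, horizon=1` arguments above mean each sample pairs 48 past steps with the next step to predict. A minimal numpy sketch of that sliding-window idea (illustration only, not the Chronos `to_torch_data_loader` implementation):

```python
import numpy as np

# a toy univariate series standing in for the scaled nyc taxi data
series = np.arange(100, dtype=np.float32)
lookback, horizon = 48, 1

# each sample: 48 past values (X) and the following 1 value to predict (y)
n_samples = len(series) - lookback - horizon + 1
X = np.stack([series[i:i + lookback] for i in range(n_samples)])
y = np.stack([series[i + lookback:i + lookback + horizon]
              for i in range(n_samples)])

print(X.shape, y.shape)  # (52, 48) (52, 1)
```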
[2]:
# get data for training and testing
train_data, test_data = get_data()
# get a trained forecaster
forecaster = get_trained_forecaster(train_data)
Global seed set to 2187352814
Global seed set to 2187352814
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs

  | Name  | Type             | Params
-------------------------------------------
0 | model | NormalizeTSModel | 4.3 K
1 | loss  | MSELoss          | 0
-------------------------------------------
4.3 K     Trainable params
0         Non-trainable params
4.3 K     Total params
0.017     Total estimated model params size (MB)

Forecaster optimization#

forecaster.optimize traverses existing optimization methods (onnxruntime, openvino, jit, …) and saves the model with minimum latency under the given data and search restrictions (accelerator, precision, accuracy_criterion) in forecaster.accelerated_model.

There are also batch_size and quantize parameters you may want to change. If you are not familiar with manual hyperparameter tuning, just leave batch_size at its default value.

[ ]:
forecaster.optimize(train_data, test_data, thread_num=1)

forecaster.optimize will generate an optimized model with the lowest latency.
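When the search includes precision-reducing methods (e.g. quantization), the accuracy_criterion restriction bounds how much the metric may degrade. A conceptual sketch of such a gate, using a hypothetical helper and a relative tolerance (not the Chronos implementation):

```python
# hypothetical helper illustrating how a relative accuracy criterion
# could gate an accelerated candidate (not the Chronos implementation)
def within_criterion(baseline_mse, candidate_mse, relative_tol=0.05):
    # accept the candidate only if its MSE is at most 5% worse than baseline
    return (candidate_mse - baseline_mse) / baseline_mse <= relative_tol

print(within_criterion(0.010, 0.0104))  # True: 4% degradation, accepted
print(within_criterion(0.010, 0.0120))  # False: 20% degradation, rejected
```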

The following blocks test the prediction time of the optimized forecaster.

[10]:
import time

st = time.time()
for _ in range(100):
    forecaster.predict(test_data)
print("The optimized forecaster cost:", time.time() - st, "s")
The optimized forecaster cost: 2.5293169021606445 s

Users may set acceleration=False to fall back to the original forecaster. This is not typical usage; here we use it to measure the original forecaster’s prediction time.

[11]:
st = time.time()
for _ in range(100):
    forecaster.predict(test_data, acceleration=False)
print("The original forecaster cost:", time.time() - st, "s")
The original forecaster cost: 7.534037113189697 s
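From the two timings printed above, the optimized forecaster is roughly 3x faster on this run (exact numbers will vary by machine):

```python
# timings copied from the notebook output above
optimized_s = 2.5293169021606445
original_s = 7.534037113189697

print(f"speedup: {original_s / optimized_s:.1f}x")  # speedup: 3.0x
```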