View the runnable example on GitHub

Accelerate PyTorch Inference using JIT/IPEX#

📝 Note

  • jit: You can use the InferenceOptimizer.trace(..., accelerator="jit") API to enable JIT acceleration for PyTorch inference.

  • ipex: You can use the InferenceOptimizer.trace(..., use_ipex=True) API to enable IPEX acceleration for PyTorch inference. It only takes a few lines.

  • jit + ipex: It is recommended to use JIT and IPEX together. You can use InferenceOptimizer.trace(..., accelerator="jit", use_ipex=True) to enable both for PyTorch inference.

To apply JIT/IPEX acceleration, the following dependencies need to be installed first:

[ ]:
# for BigDL-Nano
!pip install --pre --upgrade bigdl-nano[pytorch]  # install the nightly-built version
# !source bigdl-nano-init

📝 Note

We recommend running the commands above, especially source bigdl-nano-init, before the Jupyter kernel is started; otherwise, some of the optimizations may not take effect.
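
For example, you could source the script in the same terminal session that launches Jupyter, so that the kernel inherits the environment variables (a minimal sketch; bigdl-nano-init is installed together with bigdl-nano):

source bigdl-nano-init    # export BigDL-Nano's recommended environment variables
jupyter notebook          # start Jupyter from the same shell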

Let’s take a ResNet-18 model pretrained on the ImageNet dataset as an example. First, we load the model:

[ ]:
import torch
from torchvision.models import resnet18

model_ft = resnet18(pretrained=True)

Then we set it in evaluation mode:

[ ]:
model_ft.eval()

To accelerate inference using JIT/IPEX/JIT+IPEX, we first need to import InferenceOptimizer:

[ ]:
from bigdl.nano.pytorch import InferenceOptimizer

Accelerate Inference Using JIT Optimizer#

[ ]:
jit_model = InferenceOptimizer.trace(model_ft,
                                     accelerator="jit",
                                     input_sample=torch.rand(1, 3, 224, 224))

x = torch.rand(2, 3, 224, 224)  # a dummy input batch, just for demonstration
with InferenceOptimizer.get_context(jit_model):
    y_hat = jit_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
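
Once accelerated, you may want to persist the optimized model for later deployment. Below is a minimal sketch assuming BigDL-Nano's InferenceOptimizer.save and InferenceOptimizer.load APIs (the directory name is illustrative; please check the API documentation for your installed version):

[ ]:
# save the JIT-accelerated model to a directory and load it back
InferenceOptimizer.save(jit_model, "./jit_model")
loaded_model = InferenceOptimizer.load("./jit_model")  # JIT models can be reloaded directly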

Accelerate Inference Using IPEX Optimizer#

[ ]:
ipex_model = InferenceOptimizer.trace(model_ft,
                                      use_ipex=True)
with InferenceOptimizer.get_context(ipex_model):
    y_hat = ipex_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
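
To check whether IPEX actually helps on your hardware, a quick latency comparison against the original FP32 model can be done with standard Python timing. This is a rough sketch rather than a rigorous benchmark (the iteration counts are arbitrary):

[ ]:
import time

def measure(model, x, n=100):
    with torch.no_grad():
        for _ in range(10):  # warm-up runs to exclude one-time initialization cost
            model(x)
        start = time.perf_counter()
        for _ in range(n):
            model(x)
    return (time.perf_counter() - start) / n  # average seconds per forward pass

print(f"original FP32: {measure(model_ft, x) * 1000:.2f} ms/batch")
with InferenceOptimizer.get_context(ipex_model):
    print(f"IPEX:          {measure(ipex_model, x) * 1000:.2f} ms/batch")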

Accelerate Inference Using IPEX + JIT#

[ ]:
jit_ipex_model = InferenceOptimizer.trace(model_ft,
                                          accelerator="jit",
                                          use_ipex=True,
                                          input_sample=torch.rand(1, 3, 224, 224))
with InferenceOptimizer.get_context(jit_ipex_model):
    y_hat = jit_ipex_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
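
Since JIT and IPEX are (near-)lossless graph-level optimizations, the accelerated model's outputs should closely match those of the original FP32 model. Here is a quick sanity check in plain PyTorch (the tolerance of 1e-4 is an assumed value to absorb small numerical differences):

[ ]:
# compare the accelerated model's outputs against the original FP32 model
with torch.no_grad():
    y_fp32 = model_ft(x)
with InferenceOptimizer.get_context(jit_ipex_model):
    y_accel = jit_ipex_model(x)
print(torch.allclose(y_fp32, y_accel, atol=1e-4))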

📝 Note

input_sample is the parameter that lets the JIT accelerator know the shape of the model input, so neither the batch size nor the specific values matter to input_sample. Since the ResNet-18 model here expects images of \(224 \times 224\) pixels, we use torch.rand(1, 3, 224, 224) as input_sample. Note that input_sample is not needed in the IPEX-only case above, where no accelerator is specified.

Please refer to the API documentation for more information on InferenceOptimizer.trace.