View the runnable example on GitHub

Accelerate PyTorch Inference using JIT/IPEX#

  • JIT: You can use the InferenceOptimizer.trace(..., accelerator="jit") API to enable TorchScript acceleration for PyTorch inference.

  • IPEX: You can use the InferenceOptimizer.trace(..., use_ipex=True) API to enable IPEX (Intel® Extension for PyTorch*) acceleration for PyTorch inference.

  • JIT + IPEX: It is recommended to use JIT and IPEX together. You can use InferenceOptimizer.trace(..., accelerator="jit", use_ipex=True) to enable both for PyTorch inference.

All of the above accelerations only take a few lines to apply.

Let’s take a ResNet-18 model pretrained on the ImageNet dataset as an example. First, we load the model:

[ ]:
import torch
from torchvision.models import resnet18

model_ft = resnet18(pretrained=True)
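
If you are using a more recent torchvision release (0.13 or later), note that the pretrained argument is deprecated in favor of weights. Assuming such a version, the same ImageNet weights can be loaded as follows:

[ ]:
from torchvision.models import resnet18, ResNet18_Weights

# equivalent to pretrained=True on torchvision >= 0.13
model_ft = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)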

To accelerate inference using JIT, IPEX, or JIT together with IPEX, we need to import InferenceOptimizer first:

[ ]:
from bigdl.nano.pytorch import InferenceOptimizer

Accelerate Inference Using JIT Optimizer#

[ ]:
jit_model = InferenceOptimizer.trace(model_ft,
                                     accelerator="jit",
                                     input_sample=torch.rand(1, 3, 224, 224))
with InferenceOptimizer.get_context(jit_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = jit_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)

Accelerate Inference Using IPEX Optimizer#

[ ]:
ipex_model = InferenceOptimizer.trace(model_ft,
                                      use_ipex=True)
with InferenceOptimizer.get_context(ipex_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = ipex_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)

Accelerate Inference Using JIT + IPEX#

[ ]:
jit_ipex_model = InferenceOptimizer.trace(model_ft,
                                          accelerator="jit",
                                          use_ipex=True,
                                          input_sample=torch.rand(1, 3, 224, 224))
with InferenceOptimizer.get_context(jit_ipex_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = jit_ipex_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
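
Before the general notes below, it can be useful to sanity-check the speedup on your own machine. The following is a minimal, illustrative benchmark; the mean_latency helper is not part of Nano, and the actual numbers depend on your hardware and library versions:

[ ]:
import time

def mean_latency(model, x, n_iter=100, n_warmup=10):
    # hypothetical helper: run a few warm-up iterations first (to exclude
    # one-time compilation cost), then average the wall-clock time of
    # n_iter forward passes
    with torch.no_grad():
        for _ in range(n_warmup):
            model(x)
        start = time.perf_counter()
        for _ in range(n_iter):
            model(x)
    return (time.perf_counter() - start) / n_iter

model_ft.eval()
x = torch.rand(2, 3, 224, 224)
print(f"eager:      {mean_latency(model_ft, x) * 1000:.2f} ms/iter")
with InferenceOptimizer.get_context(jit_ipex_model):
    print(f"JIT + IPEX: {mean_latency(jit_ipex_model, x) * 1000:.2f} ms/iter")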

📝 Note

input_sample is the parameter that lets the accelerator know the shape of the model input; neither the batch size nor the specific tensor values matter, only the per-sample shape. Since our test images are \(224 \times 224\) pixels, we use torch.rand(1, 3, 224, 224) as the input_sample here (a short sketch after this note illustrates the point).

Please refer to the API documentation for more information on InferenceOptimizer.trace.

Also note that for all models optimized by Nano through InferenceOptimizer.trace, you need to wrap the inference steps with the automatic context manager InferenceOptimizer.get_context(model=...) provided by Nano. You could refer to here for more detailed usage of the context manager.
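
To illustrate the point about input_sample above, tracing with a different batch size, or with a tensor of zeros, gives an equivalent accelerated model, since only the per-sample shape (3, 224, 224) is read from it:

[ ]:
# only the shape of input_sample matters: a zero tensor with a different
# batch size traces the model just as well as torch.rand(1, 3, 224, 224)
another_jit_model = InferenceOptimizer.trace(model_ft,
                                             accelerator="jit",
                                             input_sample=torch.zeros(4, 3, 224, 224))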