View the runnable example on GitHub
Accelerate PyTorch Inference using JIT/IPEX
- JIT: You can use the InferenceOptimizer.trace(..., accelerator="jit") API to enable TorchScript acceleration for PyTorch inference.
- IPEX: You can use the InferenceOptimizer.trace(..., use_ipex=True) API to enable IPEX (Intel® Extension for PyTorch*) acceleration for PyTorch inference.
- JIT + IPEX: It is recommended to use JIT and IPEX together. You can use InferenceOptimizer.trace(..., accelerator="jit", use_ipex=True) to enable both for PyTorch inference.
All of the above accelerations take only a few lines to apply.
Let's take a ResNet-18 model pretrained on the ImageNet dataset as an example. First, we load the model:
[ ]:
import torch
from torchvision.models import resnet18

# load a ResNet-18 pretrained on ImageNet
model_ft = resnet18(pretrained=True)
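Before applying any acceleration, it can be handy to keep a baseline prediction around for later sanity checks. This is a minimal sketch using plain PyTorch (evaluation mode and the no-grad context are standard PyTorch practice, not part of the Nano API):
[ ]:
model_ft.eval()  # switch to inference mode so e.g. batch norm uses running statistics

with torch.no_grad():
    x = torch.rand(2, 3, 224, 224)  # dummy batch of two 224x224 RGB images
    baseline_predictions = model_ft(x).argmax(dim=1)
print(baseline_predictions)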
To accelerate inference using JIT, IPEX, or JIT together with IPEX, we need to import InferenceOptimizer first:
[ ]:
from bigdl.nano.pytorch import InferenceOptimizer
Accelerate Inference Using JIT Optimizer
[ ]:
# trace the model with TorchScript; input_sample tells JIT the input shape
jit_model = InferenceOptimizer.trace(model_ft,
                                     accelerator="jit",
                                     input_sample=torch.rand(1, 3, 224, 224))

with InferenceOptimizer.get_context(jit_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = jit_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
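As a quick sanity check that tracing preserved the model's behavior, you can compare the accelerated output with the original model's output on the same input. A minimal sketch, using the standard PyTorch helper torch.testing.assert_close (not part of the Nano API):
[ ]:
model_ft.eval()
x = torch.rand(2, 3, 224, 224)

with InferenceOptimizer.get_context(jit_model):
    accelerated_out = jit_model(x)

with torch.no_grad():
    original_out = model_ft(x)

# the outputs should agree up to small numerical differences
torch.testing.assert_close(accelerated_out, original_out, rtol=1e-3, atol=1e-3)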
Accelerate Inference Using IPEX Optimizer
[ ]:
# no input_sample is needed when only enabling IPEX
ipex_model = InferenceOptimizer.trace(model_ft,
                                      use_ipex=True)

with InferenceOptimizer.get_context(ipex_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = ipex_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
Accelerate Inference Using JIT + IPEX
[ ]:
# enable both TorchScript and IPEX optimizations at once
jit_ipex_model = InferenceOptimizer.trace(model_ft,
                                          accelerator="jit",
                                          use_ipex=True,
                                          input_sample=torch.rand(1, 3, 224, 224))

with InferenceOptimizer.get_context(jit_ipex_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = jit_ipex_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
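If you want to reuse the accelerated model later or in another process, Nano also provides saving and loading helpers. The sketch below assumes the InferenceOptimizer.save and InferenceOptimizer.load APIs; please check the API documentation for the exact signatures, as some accelerations require passing the original model to load:
[ ]:
# hypothetical directory for the optimized model files
InferenceOptimizer.save(jit_ipex_model, "./optimized_model")

# a JIT-traced model can typically be reloaded without the original definition;
# other accelerations may additionally need model=model_ft here
loaded_model = InferenceOptimizer.load("./optimized_model")

with InferenceOptimizer.get_context(loaded_model):
    predictions = loaded_model(torch.rand(2, 3, 224, 224)).argmax(dim=1)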
📝 Note
- input_sample is the parameter for accelerators to know the shape of the model input, so neither the batch size nor the specific values matter for input_sample. Since our test images have \(224 \times 224\) pixels, we use torch.rand(1, 3, 224, 224) for input_sample here.
- Please refer to the API documentation for more information on InferenceOptimizer.trace.
- Also note that for all models optimized by InferenceOptimizer.trace, you need to wrap the inference steps with an automatic context manager InferenceOptimizer.get_context(model=...) provided by Nano. You could refer to here for more detailed usage of the context manager.
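To check whether the acceleration actually pays off on your hardware, a quick latency comparison can help. This is an illustrative micro-benchmark sketch only, not a rigorous measurement; real benchmarks should use representative inputs and more iterations:
[ ]:
import time

def measure_latency(forward, n_iter=100):
    """Return the average forward latency in milliseconds."""
    x = torch.rand(2, 3, 224, 224)
    for _ in range(10):  # warm-up runs
        forward(x)
    start = time.perf_counter()
    for _ in range(n_iter):
        forward(x)
    return (time.perf_counter() - start) / n_iter * 1000

with torch.no_grad():
    base_ms = measure_latency(model_ft)
with InferenceOptimizer.get_context(jit_ipex_model):
    accel_ms = measure_latency(jit_ipex_model)

print(f"original model: {base_ms:.2f} ms, JIT + IPEX: {accel_ms:.2f} ms")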