Save and Load Optimized IPEX Model
This example illustrates how to save and load a model accelerated by IPEX. We use a pretrained ResNet18 model. Calling InferenceOptimizer.trace(..., use_ipex=True) returns a model accelerated by IPEX. Calling InferenceOptimizer.save(model, path) saves that model to a folder, and calling InferenceOptimizer.load(path, model=...) loads it back from the folder.
To run inference with the BigDL-Nano InferenceOptimizer, the following packages need to be installed first. We recommend using Miniconda to prepare the environment and installing the packages in a conda environment.
You can create a conda environment by executing:
# "nano" is the conda environment name; you can use any name you like.
conda create -n nano python=3.7 setuptools=58.0.4
conda activate nano
📝 Note
During installation, you may see warnings or errors about package versions. These can be ignored.
[ ]:
# Necessary packages for inference acceleration
!pip install --pre --upgrade bigdl-nano[pytorch]
First, prepare the model. We need to load the pretrained ResNet18 model.
[ ]:
import torch
from torchvision.models import resnet18
model_ft = resnet18(pretrained=True)
model_ft.eval()
Accelerate Inference Using IPEX
[ ]:
from bigdl.nano.pytorch import InferenceOptimizer
ipex_model = InferenceOptimizer.trace(model_ft, use_ipex=True)
Save Optimized IPEX Model
The model files are saved in the “./optimized_model_ipex” directory. The directory contains 2 files; users only need the “ckpt.pth” file for further usage:
nano_model_meta.yml: meta information of the saved model checkpoint
ckpt.pth: PyTorch state dict checkpoint for general use; it stores the model's parameters
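After saving, it can be useful to confirm that both files listed above are present. The following is a minimal sketch using only the standard library; the helper name missing_files is hypothetical and not part of BigDL-Nano, and the file names are taken from the list above.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = ("nano_model_meta.yml", "ckpt.pth")


def missing_files(save_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from save_dir."""
    root = Path(save_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical check against the directory used in this example.
    missing = missing_files("./optimized_model_ipex")
    print("missing files:", missing or "none")
```

An empty list means both files were found in the folder.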
[ ]:
InferenceOptimizer.save(ipex_model, "./optimized_model_ipex")
Load the Optimized Model
📝 Note
For a model accelerated by JIT, OpenVINO or ONNXRuntime, we save the structure of its network, so the unaccelerated model is not needed when loading the optimized model.
For a model accelerated by IPEX, we only store the state_dict, which is simply a Python dictionary object that maps each layer to its parameter tensor. When loading the optimized model, the original model must therefore be passed in.
[ ]:
loaded_model = InferenceOptimizer.load("./optimized_model_ipex", model=model_ft)
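The reason the original model is needed can be seen with plain PyTorch. The sketch below is not BigDL-Nano code: it uses a small stand-in network (build_model is a hypothetical helper) and an in-memory buffer to show that a state_dict holds only named parameter tensors, so a model with the same structure has to exist before the parameters can be loaded into it.

```python
import io

import torch
from torch import nn


def build_model():
    # The architecture is defined in code; the checkpoint does not contain it.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


original = build_model()

# Save only the state_dict (parameter name -> tensor) to an in-memory buffer.
buffer = io.BytesIO()
torch.save(original.state_dict(), buffer)
buffer.seek(0)

# Loading requires a model with the same structure to receive the parameters.
restored = build_model()
restored.load_state_dict(torch.load(buffer))

keys = list(restored.state_dict().keys())
params_equal = all(
    torch.equal(original.state_dict()[k], restored.state_dict()[k]) for k in keys
)
print(keys)
print(params_equal)
```

The keys are parameter names only; nothing in the saved data says that the layers are Linear or how they are connected.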
Inference with the Loaded Model
[ ]:
with InferenceOptimizer.get_context(loaded_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = loaded_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
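A common sanity check after loading is to compare the loaded model's outputs with those of the original model on the same input. The sketch below assumes such a check is wanted here; outputs_match is a hypothetical helper, the tolerances are illustrative, and small stand-in models are used so that it runs without BigDL-Nano installed.

```python
import copy

import torch
from torch import nn


def outputs_match(reference, candidate, x, rtol=1e-3, atol=1e-5):
    """Return True if both models give numerically close outputs for x."""
    with torch.no_grad():
        return torch.allclose(reference(x), candidate(x), rtol=rtol, atol=atol)


# Stand-in models so the sketch runs without BigDL-Nano installed.
reference = nn.Linear(8, 3).eval()
candidate = copy.deepcopy(reference)

x = torch.rand(2, 8)
agree = outputs_match(reference, candidate, x)
print(agree)
```

In this example the call would presumably be outputs_match(model_ft, loaded_model, x), made inside the InferenceOptimizer.get_context(loaded_model) block. An accelerated model may differ slightly from the original numerically, so the tolerances may need adjusting.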
📚 Related Readings