Save and Load ONNXRuntime Model
This example illustrates how to save and load a model accelerated by ONNXRuntime.
In this example, we use a pretrained ResNet18 model. By calling
trace(..., accelerator="onnxruntime"), we obtain a model accelerated by ONNXRuntime through BigDL-Nano for inference. By calling
save(model=..., path=...), we save the Nano-optimized model to a folder. By calling
load(path=...), we load the ONNXRuntime-optimized model back from that folder.
First, prepare the model by loading the pretrained ResNet18:
import torch
from torchvision.models import resnet18

model_ft = resnet18(pretrained=True)
Accelerate Inference Using ONNXRuntime
from bigdl.nano.pytorch import InferenceOptimizer

ort_model = InferenceOptimizer.trace(model_ft,
                                     accelerator="onnxruntime",
                                     input_sample=torch.rand(1, 3, 224, 224))
Save Optimized Model
InferenceOptimizer.save(ort_model, "./optimized_model_ort")
The model files are saved in the "./optimized_model_ort" directory.
The optimized_model_ort directory contains two files. Only the ".onnx" file is needed for further use:
nano_model_meta.yml: meta information of the saved model checkpoint
onnx_saved_model.onnx: model checkpoint for general use, describes model structure
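Since only the ".onnx" file is needed for further use, a small helper can pick it out of the saved directory and hand it to another runtime. The sketch below is illustrative and not part of BigDL-Nano: find_onnx_file and run_with_onnxruntime are hypothetical names, and the second function assumes that the onnxruntime and numpy packages are installed and that the exported graph has a single input.

```python
from pathlib import Path


def find_onnx_file(model_dir):
    """Return the path of the single .onnx file inside model_dir."""
    candidates = sorted(Path(model_dir).glob("*.onnx"))
    if len(candidates) != 1:
        raise FileNotFoundError(
            f"expected exactly one .onnx file in {model_dir}, found {len(candidates)}"
        )
    return candidates[0]


def run_with_onnxruntime(onnx_path, batch):
    """Run one batch through the exported model using plain onnxruntime.

    batch is assumed to be a float32 NumPy array shaped like the
    input_sample used for tracing, e.g. (1, 3, 224, 224).
    """
    # Imported here so that find_onnx_file works without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(str(onnx_path), providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # assumes a single graph input
    # run(None, ...) returns every graph output; keep the first one.
    return session.run(None, {input_name: batch})[0]
```

For example, run_with_onnxruntime(find_onnx_file("./optimized_model_ort"), batch) would return the raw model outputs for batch as a NumPy array.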
Load the Optimized Model
loaded_model = InferenceOptimizer.load("./optimized_model_ort")
For a model accelerated by ONNXRuntime, we save the structure of its network. So, the original model is not needed when we load the optimized model.
Inference with the Loaded Model
with InferenceOptimizer.get_context(loaded_model):
    x = torch.rand(2, 3, 224, 224)
    y_hat = loaded_model(x)
    predictions = y_hat.argmax(dim=1)
    print(predictions)
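To make the argmax(dim=1) step concrete: the model returns one row of scores per input image, and argmax(dim=1) picks the index of the largest score in each row, which serves as the predicted class index. Below is a pure-Python sketch of that reduction on made-up scores. argmax_per_row is a hypothetical helper, not a torch or BigDL-Nano function, and the numbers are illustrative only.

```python
def argmax_per_row(scores):
    """Return, for each row, the index of its largest value (first one on ties)."""
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Two "images" with three made-up class scores each.
fake_scores = [
    [0.1, 2.0, -1.0],
    [3.0, 0.5, 0.2],
]
print(argmax_per_row(fake_scores))  # [1, 0]
```

With the pretrained ResNet18 used here, each row would hold 1000 scores (one per ImageNet class), so predictions contains two class indices for the batch of two random inputs.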
📚 Related Readings