View the runnable example on GitHub
Convert PyTorch Training Loop to Use TorchNano
📚 Related Reading
If you have already defined a PyTorch training loop function with a model, optimizers, and dataloaders as parameters, you could refer to this guide to use the @nano decorator, which is a simpler way to gain acceleration from BigDL-Nano.
The TorchNano API integrates multiple optimizations to accelerate custom PyTorch training loops. As a pure PyTorch user, you only need to apply a few changes to your existing code to use TorchNano.
📝 Note
Before starting your PyTorch application, it is highly recommended to run source bigdl-nano-init to set several environment variables based on your current hardware. Empirically, these variables bring a significant performance improvement for most PyTorch training workloads.
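For example, assuming your training script is saved as train.py (a hypothetical file name used here for illustration), a typical launch looks like:
source bigdl-nano-init
python train.py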
PyTorch Training Loops Example
Suppose you would like to finetune a ResNet-18 model (pretrained on the ImageNet dataset) on the OxfordIIITPet dataset. You may create the datasets and the model, and define your training loop, as follows:
[ ]:
import torch
from tqdm import tqdm

def train_loops():
    model = MyPytorchModule()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
    loss_func = torch.nn.CrossEntropyLoss()
    train_loader = create_train_dataloader()

    num_epochs = 5
    for epoch in range(num_epochs):
        model.train()
        train_loss, num = 0, 0
        with tqdm(train_loader, unit="batch") as tepoch:
            for data, target in tepoch:
                tepoch.set_description(f"Epoch {epoch}")
                optimizer.zero_grad()
                output = model(data)
                loss = loss_func(output, target)
                loss.backward()
                optimizer.step()
                # use .item() so the running total does not retain the autograd graph
                loss_value = loss.item()
                train_loss += loss_value
                num += 1
                tepoch.set_postfix(loss=loss_value)
        print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')
The definition of MyPytorchModule and create_train_dataloader can be found in the runnable example.
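For reference, here is a minimal sketch of what these two definitions could look like. This is an assumption for illustration only; the runnable example may differ in details such as the transforms, dataset root, and batch size.
[ ]:
import torch
from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.datasets import OxfordIIITPet
from torchvision.models import resnet18

class MyPytorchModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # start from ImageNet-pretrained weights
        self.model = resnet18(pretrained=True)
        # OxfordIIITPet contains 37 pet breed classes
        self.model.fc = torch.nn.Linear(self.model.fc.in_features, 37)

    def forward(self, x):
        return self.model(x)

def create_train_dataloader():
    # standard ImageNet-style preprocessing
    transform = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    train_set = OxfordIIITPet(root="data", transform=transform, download=True)
    return DataLoader(train_set, batch_size=32, shuffle=True)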
Convert to TorchNano
There are 5 simple steps to convert your PyTorch code to use TorchNano:
1. Import TorchNano
2. Subclass TorchNano and override its train method
3. Move the code for your custom training loops inside the TorchNano's train method
4. Call TorchNano's setup method to set up model, optimizer(s), and dataloader(s) for accelerated training
5. Replace loss.backward() with self.backward(loss)
[ ]:
# Step 1. import TorchNano
from bigdl.nano.pytorch import TorchNano

# Step 2. subclass TorchNano and override its train method
class MyNano(TorchNano):
    def train(self):
        # Step 3. move the code for your custom training loops
        # inside the train method
        model = MyPytorchModule()
        optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
        loss_func = torch.nn.CrossEntropyLoss()
        train_loader = create_train_dataloader()

        # Step 4. call setup method to set up model, optimizer(s),
        # and dataloader(s) for accelerated training
        model, optimizer, train_loader = self.setup(model, optimizer, train_loader)

        num_epochs = 5
        for epoch in range(num_epochs):
            model.train()
            train_loss, num = 0, 0
            with tqdm(train_loader, unit="batch") as tepoch:
                for data, target in tepoch:
                    tepoch.set_description(f"Epoch {epoch}")
                    optimizer.zero_grad()
                    output = model(data)
                    loss = loss_func(output, target)
                    # Step 5. replace loss.backward() with self.backward(loss)
                    self.backward(loss)
                    optimizer.step()
                    # use .item() so the running total does not retain the autograd graph
                    loss_value = loss.item()
                    train_loss += loss_value
                    num += 1
                    tepoch.set_postfix(loss=loss_value)
            print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')
📝 Note
To make sure that the converted TorchNano still has a functional training loop, there are some requirements:
- there should be one and only one instance of torch.nn.Module as the model in the training loop
- there should be at least one instance of torch.optim.Optimizer as an optimizer in the training loop
- there should be at least one instance of torch.utils.data.DataLoader as a dataloader in the training loop
You could then do the training by instantiating MyNano and calling its train method:
[ ]:
MyNano().train()
📝 Note
Due to the optimized environment variables set by source bigdl-nano-init, you could already experience some training acceleration after converting your PyTorch code to use TorchNano.
For more optimizations provided by TorchNano, you can refer to the Related Readings.
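Those optimizations are typically enabled through arguments passed when instantiating your TorchNano subclass. A hedged sketch, assuming the keyword arguments described in the guides below (the exact names may vary between BigDL-Nano versions):
[ ]:
# the keyword arguments below are assumptions based on the Related Readings;
# consult those guides for the options your BigDL-Nano version supports
MyNano(use_ipex=True, num_processes=2).train()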
📚 Related Readings
How to accelerate a PyTorch application on training workloads through Intel® Extension for PyTorch*
How to accelerate a PyTorch application on training workloads through multiple instances
How to use the channels last memory format in your PyTorch application for training
How to conduct BFloat16 Mixed Precision training in your PyTorch application