View the runnable example on GitHub
Use @nano Decorator to Accelerate PyTorch Training Loop
BigDL-Nano integrates multiple optimizations to accelerate PyTorch training workloads. As a pure PyTorch user, you could simply wrap your custom PyTorch training loop with the @nano decorator to benefit from BigDL-Nano.
📝 Note
Before starting your PyTorch application, it is highly recommended to run
source bigdl-nano-init
to set several environment variables based on your current hardware. Empirically, these variables bring a significant performance improvement for most PyTorch training workloads.
Suppose you define your custom PyTorch training loop as follows. To benefit from BigDL-Nano integrated optimizations, you could simply import the nano decorator and wrap the training loop with it.
[ ]:
from tqdm import tqdm

from bigdl.nano.pytorch import nano # import nano decorator

@nano() # apply the decorator to the training loop
def training_loop(model, optimizer, train_loader, num_epochs, loss_func):

    for epoch in range(num_epochs):

        model.train()
        train_loss, num = 0, 0
        with tqdm(train_loader, unit="batch") as tepoch:
            for data, target in tepoch:
                tepoch.set_description(f"Epoch {epoch}")
                optimizer.zero_grad()
                output = model(data)
                loss = loss_func(output, target)
                loss.backward()
                optimizer.step()
                loss_value = loss.sum()
                train_loss += loss_value
                num += 1
                tepoch.set_postfix(loss=loss_value)
        print(f'Train Epoch: {epoch}, avg_loss: {train_loss / num}')
📝 Note
To make sure @nano is functional on your custom training loop, there are some requirements on its parameter list:
there should be one and only one instance of torch.nn.Module passed in the training loop as the model
there should be at least one instance of torch.optim.Optimizer passed in the training loop as the optimizer
there should be at least one instance of torch.utils.data.DataLoader passed in the training loop as the dataloader
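As a hedged illustration of these requirements (not part of the runnable example), a signature like the one below should also qualify, since it passes exactly one nn.Module, one optimizer, and two DataLoader instances ("at least one" DataLoader is required):
[ ]:
from bigdl.nano.pytorch import nano

# Another valid parameter list: one torch.nn.Module, one torch.optim.Optimizer
# and two torch.utils.data.DataLoader instances ("at least one" is required).
# The argument names here are purely illustrative.
@nano()
def train_and_validate(model, optimizer, train_loader, val_loader, num_epochs, loss_func):
    ...  # your training and validation logic here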
You could then call the training_loop function as normal:
[ ]:
import torch

model = MyPytorchModule()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4)
loss_func = torch.nn.CrossEntropyLoss()
train_loader = create_train_dataloader()

training_loop(model, optimizer, train_loader, num_epochs=5, loss_func=loss_func)
The definition of MyPytorchModule
and create_train_dataloader
can be found in the runnable example.
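If you want to try the snippet above without downloading the runnable example, here is a minimal, self-contained sketch of what MyPytorchModule and create_train_dataloader could look like. It uses a tiny classifier and randomly generated tensors, so it is only a stand-in for the actual definitions used in the example:
[ ]:
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class MyPytorchModule(nn.Module):
    # A tiny image classifier used only as an illustrative stand-in
    # for the module defined in the runnable example.
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, x):
        x = self.features(x).flatten(1)
        return self.classifier(x)

def create_train_dataloader(num_samples=256, batch_size=32):
    # Random tensors standing in for a real dataset, so the snippet runs anywhere.
    images = torch.randn(num_samples, 3, 32, 32)
    labels = torch.randint(0, 10, (num_samples,))
    return DataLoader(TensorDataset(images, labels), batch_size=batch_size, shuffle=True)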
📝 Note
Due to the optimized environment variables set by source bigdl-nano-init, you could already experience some training acceleration after wrapping your custom training loop with the @nano decorator.
For more optimizations provided by the @nano decorator, you can refer to the Related Readings.
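For reference, those further optimizations are typically enabled through keyword arguments of the @nano decorator. The sketch below shows roughly how that could look; the exact argument names and supported values are an assumption here and may differ across BigDL-Nano versions, so please follow the Related Readings for authoritative usage:
[ ]:
from bigdl.nano.pytorch import nano

# The keyword arguments below mirror the optimizations covered in the Related
# Readings (IPEX, multi-instance training, channels last, BF16 mixed precision);
# they are an assumption-based sketch - check the linked guides for the exact
# parameters supported by your BigDL-Nano version.
@nano(use_ipex=True,       # Intel Extension for PyTorch optimizations
      num_processes=2,     # multi-instance (multi-process) training
      channels_last=True,  # channels last memory format
      precision='bf16')    # BFloat16 mixed precision
def training_loop(model, optimizer, train_loader, num_epochs, loss_func):
    ...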
📚 Related Readings
How to accelerate a PyTorch application on training workloads through Intel® Extension for PyTorch*
How to accelerate a PyTorch application on training workloads through multiple instances
How to use the channels last memory format in your PyTorch application for training
How to conduct BFloat16 Mixed Precision training in your PyTorch application