Nano TensorFlow API

bigdl.nano.tf.keras

class bigdl.nano.tf.keras.Model(*args, **kwargs)

A wrapper class for tf.keras.Model that adds BigDL-Nano functionality.

fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto', callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_batch_size=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False, num_processes=None, backend='multiprocessing')

Override tf.keras.Model.fit to add more parameters.

All arguments that already exist in tf.keras.Model.fit have the same semantics as in tf.keras.Model.fit.

Additional parameters

num_processes – When not None, the number of sub-processes to launch for pseudo-distributed training; when None, training runs in the current process.
backend – When num_processes is not None, the backend used to launch the sub-processes for pseudo-distributed training; when num_processes is None, this parameter has no effect.
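The call pattern for the two additional parameters can be sketched as follows. To keep the sketch runnable without TensorFlow installed, RecordingModel is a hypothetical stand-in that only records the keyword arguments it receives; in real use, model would be a compiled bigdl.nano.tf.keras.Model. The helper name fit_pseudo_distributed and the value 2 for num_processes are illustrative assumptions, not part of BigDL-Nano.

```python
def fit_pseudo_distributed(model, x, y, num_processes=None, epochs=1):
    """Call model.fit, adding the Nano-specific arguments only when requested.

    `model` is assumed to expose the fit() signature documented above.
    """
    kwargs = {"epochs": epochs}
    if num_processes is not None:
        # Sub-processes are launched only when num_processes is set;
        # 'multiprocessing' is the default backend shown in the signature.
        kwargs["num_processes"] = num_processes
        kwargs["backend"] = "multiprocessing"
    return model.fit(x, y, **kwargs)


class RecordingModel:
    """Hypothetical stand-in for a Nano Keras model (not part of BigDL)."""

    def fit(self, x, y, **kwargs):
        # Record and return the keyword arguments instead of training.
        self.last_kwargs = kwargs
        return kwargs


stub = RecordingModel()
single = fit_pseudo_distributed(stub, [1, 2], [0, 1])
multi = fit_pseudo_distributed(stub, [1, 2], [0, 1], num_processes=2)
```

With num_processes left as None, the call reduces to an ordinary fit in the current process; only the second call asks for sub-processes.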

quantize(precision: str = 'int8', accelerator: Optional[str] = None, calib_dataset: Optional[tensorflow.python.data.ops.dataset_ops.DatasetV1] = None, metric: Optional[tensorflow.python.keras.metrics.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, batch=None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Performs post-training quantization on a Keras model.

Parameters

calib_dataset – A tf.data.Dataset object for calibration. Required for static quantization. It is also used as the validation dataloader.
precision – Global precision of the quantized model. Supported types: ‘int8’, ‘bf16’, ‘fp16’. Defaults to ‘int8’.
accelerator – The accelerator to use: None, ‘onnxruntime’, or ‘openvino’. Defaults to None, which means staying in TensorFlow.
metric – A tensorflow.keras.metrics.Metric object for evaluation.
accuracy_criterion – Tolerable accuracy drop. accuracy_criterion = {‘relative’: 0.1, ‘higher_is_better’: True} allows a relative accuracy loss of 10%. accuracy_criterion = {‘absolute’: 0.99, ‘higher_is_better’: False} means accuracy must be smaller than 0.99.
approach – ‘static’ or ‘dynamic’. ‘static’ selects post_training_static_quant and ‘dynamic’ selects post_training_dynamic_quant. Default: ‘static’. OpenVINO supports static mode only.
method – The quantization method. When accelerator=None, the only supported method is None. When accelerator=‘onnxruntime’, the supported methods are ‘qlinear’ and ‘integer’, defaulting to ‘qlinear’. ‘qlinear’ is suggested for a lower accuracy drop when using static quantization. More details at https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO, so do not change it when using OpenVINO.
conf – Path to a YAML configuration file for quantization. Default: None, which uses the default configuration.
tuning_strategy – ‘bayesian’, ‘basic’, ‘mse’, ‘sigopt’. Default: ‘bayesian’.
timeout – Tuning timeout in seconds. Default: None, which means early stop. Combined with max_trials to decide when to exit.
max_trials – Maximum number of tuning trials. Default: None, which means no tuning. Combined with timeout to decide when to exit. “timeout=0, max_trials=1” means quantization is tried only once and the best satisfying model is returned.
batch – Batch size of the dataloader for calib_dataset. Defaults to None, in which case the batch size is 1 if the dataset is not a BatchDataset and otherwise follows dataset._batch_size.
inputs – A list of input names. Default: None, in which case the names are obtained automatically from the graph.
outputs – A list of output names. Default: None, in which case the names are obtained automatically from the graph.
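Several of the parameters above constrain one another. The sketch below collects those constraints into a pre-flight check that can be run before calling quantize. The helper check_quantize_args is hypothetical and not part of BigDL-Nano; it encodes only what the parameter list states and makes no claim about what quantize itself validates.

```python
def check_quantize_args(accelerator=None, approach="static", method=None,
                        calib_dataset=None):
    """Return a list of problems found in a planned quantize() call.

    Hypothetical helper: it mirrors the documented constraints only.
    """
    problems = []
    if accelerator not in (None, "onnxruntime", "openvino"):
        problems.append(f"unsupported accelerator: {accelerator!r}")
    if approach not in ("static", "dynamic"):
        problems.append(f"unsupported approach: {approach!r}")
    if approach == "static" and calib_dataset is None:
        # calib_dataset is documented as required for static quantization.
        problems.append("static quantization requires calib_dataset")
    if accelerator == "openvino" and approach != "static":
        problems.append("OpenVINO supports static mode only")
    if accelerator is None and method is not None:
        problems.append("method must be None when accelerator is None")
    if accelerator == "onnxruntime" and method not in (None, "qlinear", "integer"):
        problems.append(f"unsupported onnxruntime method: {method!r}")
    return problems


# Any non-None object stands in for a tf.data.Dataset here.
ok = check_quantize_args(accelerator="onnxruntime", method="qlinear",
                         calib_dataset=[1])
bad = check_quantize_args(accelerator="openvino", approach="dynamic")
```

Returning a list rather than raising on the first problem lets a caller report every conflicting argument at once.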

Returns

A TensorflowBaseModel for INC. If no model is found, None is returned.
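Because quantize may return None, callers may want to guard the result. A minimal sketch, assuming that falling back to the original model is acceptable for the application: quantize_or_fallback and StubModel are hypothetical names used for illustration, and StubModel only imitates the return behaviour so the sketch runs without TensorFlow.

```python
def quantize_or_fallback(model, **quantize_kwargs):
    """Return the quantized model, or the original model if quantize() gives None."""
    quantized = model.quantize(**quantize_kwargs)
    if quantized is None:
        # None is the documented result when no model is found.
        return model
    return quantized


class StubModel:
    """Hypothetical stand-in for a Nano Keras model (not part of BigDL)."""

    def __init__(self, result):
        self._result = result

    def quantize(self, **kwargs):
        # Return the preset result instead of running quantization.
        return self._result


failing = StubModel(None)
kept = quantize_or_fallback(failing, precision="int8")

succeeding = StubModel("quantized-model")
new = quantize_or_fallback(succeeding, precision="int8")
```

Raising an exception instead of falling back is an equally reasonable choice when an unquantized model must never reach deployment.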

trace(accelerator=None, input_sample=None, onnxruntime_session_options=None)

Trace a Keras model and convert it into an accelerated module for inference.

For example, this function returns a KerasOpenVINOModel when accelerator=‘openvino’.

Parameters

model – The Keras model to trace. It does not appear in the signature above, where the model is the instance that trace is called on.
input_sample – A set of inputs used for tracing. Defaults to None, which is sufficient if the model has been traced before.
accelerator – The accelerator to use. Defaults to None, which means staying in the Keras backend. ‘openvino’ and ‘onnxruntime’ are supported for now.
onnxruntime_session_options – The session options for ONNX Runtime. Only valid when accelerator=‘onnxruntime’; otherwise it is ignored.

Returns

A model accelerated with the chosen backend (OpenVINO or ONNX Runtime).
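The interaction between accelerator and onnxruntime_session_options can be sketched as a small argument builder. build_trace_kwargs is a hypothetical helper, not part of BigDL-Nano; it rejects accelerators that are not listed above and passes the session options along only where they have an effect.

```python
def build_trace_kwargs(accelerator=None, input_sample=None, session_options=None):
    """Assemble keyword arguments for trace() from the parameters above."""
    if accelerator not in (None, "openvino", "onnxruntime"):
        raise ValueError(f"unsupported accelerator: {accelerator!r}")
    kwargs = {"accelerator": accelerator, "input_sample": input_sample}
    if accelerator == "onnxruntime" and session_options is not None:
        # Session options are only valid for ONNX Runtime and would be
        # ignored for any other accelerator, so they are left out there.
        kwargs["onnxruntime_session_options"] = session_options
    return kwargs


# Placeholder values stand in for a real input sample and session options.
openvino_kwargs = build_trace_kwargs("openvino", input_sample=[[0.0, 1.0]],
                                     session_options="opts")
ort_kwargs = build_trace_kwargs("onnxruntime", session_options="opts")
```

The resulting dictionary would then be passed as model.trace(**kwargs) on a Nano Keras model.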

class bigdl.nano.tf.keras.Sequential(*args, **kwargs)

A wrapper class for tf.keras.Sequential that adds BigDL-Nano functionality.

Creates a Nano Sequential model with the same arguments as tf.keras.Sequential.

fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto', callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_batch_size=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False, num_processes=None, backend='multiprocessing')

Override tf.keras.Model.fit to add more parameters.

All arguments that already exist in tf.keras.Model.fit have the same semantics as in tf.keras.Model.fit.

Additional parameters

num_processes – When not None, the number of sub-processes to launch for pseudo-distributed training; when None, training runs in the current process.
backend – When num_processes is not None, the backend used to launch the sub-processes for pseudo-distributed training; when num_processes is None, this parameter has no effect.

quantize(precision: str = 'int8', accelerator: Optional[str] = None, calib_dataset: Optional[tensorflow.python.data.ops.dataset_ops.DatasetV1] = None, metric: Optional[tensorflow.python.keras.metrics.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, batch=None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Performs post-training quantization on a Keras model.

Parameters

calib_dataset – A tf.data.Dataset object for calibration. Required for static quantization. It is also used as the validation dataloader.
precision – Global precision of the quantized model. Supported types: ‘int8’, ‘bf16’, ‘fp16’. Defaults to ‘int8’.
accelerator – The accelerator to use: None, ‘onnxruntime’, or ‘openvino’. Defaults to None, which means staying in TensorFlow.
metric – A tensorflow.keras.metrics.Metric object for evaluation.
accuracy_criterion – Tolerable accuracy drop. accuracy_criterion = {‘relative’: 0.1, ‘higher_is_better’: True} allows a relative accuracy loss of 10%. accuracy_criterion = {‘absolute’: 0.99, ‘higher_is_better’: False} means accuracy must be smaller than 0.99.
approach – ‘static’ or ‘dynamic’. ‘static’ selects post_training_static_quant and ‘dynamic’ selects post_training_dynamic_quant. Default: ‘static’. OpenVINO supports static mode only.
method – The quantization method. When accelerator=None, the only supported method is None. When accelerator=‘onnxruntime’, the supported methods are ‘qlinear’ and ‘integer’, defaulting to ‘qlinear’. ‘qlinear’ is suggested for a lower accuracy drop when using static quantization. More details at https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO, so do not change it when using OpenVINO.
conf – Path to a YAML configuration file for quantization. Default: None, which uses the default configuration.
tuning_strategy – ‘bayesian’, ‘basic’, ‘mse’, ‘sigopt’. Default: ‘bayesian’.
timeout – Tuning timeout in seconds. Default: None, which means early stop. Combined with max_trials to decide when to exit.
max_trials – Maximum number of tuning trials. Default: None, which means no tuning. Combined with timeout to decide when to exit. “timeout=0, max_trials=1” means quantization is tried only once and the best satisfying model is returned.
batch – Batch size of the dataloader for calib_dataset. Defaults to None, in which case the batch size is 1 if the dataset is not a BatchDataset and otherwise follows dataset._batch_size.
inputs – A list of input names. Default: None, in which case the names are obtained automatically from the graph.
outputs – A list of output names. Default: None, in which case the names are obtained automatically from the graph.
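The default rule for batch can be written out as a small function. This is a sketch of the rule as worded in the parameter list, not BigDL-Nano's code: resolve_calib_batch_size is a hypothetical name, the dataset is reduced to its batch size (None for an unbatched dataset), and the precedence of an explicitly passed batch is an assumption that the text above does not confirm.

```python
def resolve_calib_batch_size(batch=None, dataset_batch_size=None):
    """Pick the calibration batch size following the rule for `batch` above.

    dataset_batch_size is the batch size of an already batched dataset,
    or None for an unbatched one (a simplification of inspecting the dataset).
    """
    if batch is not None:
        # Assumption: an explicitly passed batch takes precedence.
        return batch
    if dataset_batch_size is None:
        # Unbatched dataset and no explicit batch: fall back to 1.
        return 1
    # Already batched dataset: follow its own batch size.
    return dataset_batch_size


unbatched_default = resolve_calib_batch_size()
batched_default = resolve_calib_batch_size(dataset_batch_size=32)
explicit = resolve_calib_batch_size(batch=8)
```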

Returns

A TensorflowBaseModel for INC. If no model is found, None is returned.

trace(accelerator=None, input_sample=None, onnxruntime_session_options=None)

Trace a Keras model and convert it into an accelerated module for inference.

For example, this function returns a KerasOpenVINOModel when accelerator=‘openvino’.

Parameters

model – The Keras model to trace. It does not appear in the signature above, where the model is the instance that trace is called on.
input_sample – A set of inputs used for tracing. Defaults to None, which is sufficient if the model has been traced before.
accelerator – The accelerator to use. Defaults to None, which means staying in the Keras backend. ‘openvino’ and ‘onnxruntime’ are supported for now.
onnxruntime_session_options – The session options for ONNX Runtime. Only valid when accelerator=‘onnxruntime’; otherwise it is ignored.

Returns

A model accelerated with the chosen backend (OpenVINO or ONNX Runtime).