Nano TensorFlow API

bigdl.nano.tf.keras

class bigdl.nano.tf.keras.Model(*args, **kwargs)[source]

A wrapper class for tf.keras.Model adding more functions for BigDL-Nano.

fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto', callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_batch_size=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False, num_processes=None, backend='multiprocessing')

Override tf.keras.Model.fit to add more parameters.

All arguments that already exist in tf.keras.Model.fit have the same semantics as in tf.keras.Model.fit.

Additional parameters

Parameters
  • num_processes – when num_processes is not None, it specifies how many sub-processes to launch to run pseudo-distributed training; when num_processes is None, training will run in the current process.

  • backend – when num_processes is not None, it specifies which backend to use when launching sub-processes to run pseudo-distributed training; when num_processes is None, this parameter has no effect.
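As a minimal sketch of the two additional parameters, the function below fits a small model through the nano Model wrapper. It assumes TensorFlow, NumPy, and BigDL-Nano are installed; the function name train_pseudo_distributed, the synthetic data, and the choice to pass a tf.data.Dataset are illustrative assumptions rather than requirements stated above.

```python
def train_pseudo_distributed(num_processes=2):
    """Fit a tiny model, optionally across several local sub-processes.

    Hypothetical helper for illustration; not part of BigDL-Nano.
    """
    # Imports stay inside the function so that merely defining it does not
    # require TensorFlow or BigDL-Nano to be installed.
    import numpy as np
    import tensorflow as tf
    from bigdl.nano.tf.keras import Model

    # Synthetic regression data, wrapped in a batched tf.data.Dataset
    # (an assumption made for this sketch).
    x = np.random.rand(256, 8).astype("float32")
    y = np.random.rand(256, 1).astype("float32")
    dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)

    # The wrapper is constructed like a functional tf.keras.Model.
    inputs = tf.keras.Input(shape=(8,))
    hidden = tf.keras.layers.Dense(16, activation="relu")(inputs)
    outputs = tf.keras.layers.Dense(1)(hidden)
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer="adam", loss="mse")

    # num_processes=None trains in the current process; an integer launches
    # that many sub-processes with the chosen backend.
    return model.fit(dataset, epochs=1,
                     num_processes=num_processes, backend="multiprocessing")
```

Calling train_pseudo_distributed(num_processes=None) would fall back to ordinary single-process training, which is a convenient way to compare the two modes.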

quantize(precision: str = 'int8', accelerator: Optional[str] = None, calib_dataset: Optional[tensorflow.python.data.ops.dataset_ops.DatasetV1] = None, metric: Optional[tensorflow.python.keras.metrics.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, batch=None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Perform post-training quantization on a Keras model.

Parameters
  • calib_dataset – A tf.data.Dataset object for calibration. Required for static quantization. It is also used as the validation dataloader.

  • precision – Global precision of the quantized model. Supported types: 'int8', 'bf16', 'fp16'. Defaults to 'int8'.

  • accelerator – The accelerator to use: None, 'onnxruntime', or 'openvino'. Defaults to None, which means staying in TensorFlow.

  • metric – A tensorflow.keras.metrics.Metric object for evaluation.

  • accuracy_criterion – Tolerable accuracy drop. accuracy_criterion = {'relative': 0.1, 'higher_is_better': True} allows a relative accuracy loss of 10%. accuracy_criterion = {'absolute': 0.99, 'higher_is_better': False} means accuracy must be smaller than 0.99.

  • approach – 'static' or 'dynamic'. 'static': post_training_static_quant, 'dynamic': post_training_dynamic_quant. Default: 'static'. OpenVINO supports static mode only.

  • method – The quantization method. When accelerator=None, the only supported method is None. When accelerator='onnxruntime', the supported methods are 'qlinear' and 'integer', defaulting to 'qlinear'. 'qlinear' is suggested for a lower accuracy drop when using static quantization. More details at https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO; do not change it when using OpenVINO.

  • conf – A path to a conf YAML file for quantization. Default: None, which uses the default config.

  • tuning_strategy – 'bayesian', 'basic', 'mse', 'sigopt'. Default: 'bayesian'.

  • timeout – Tuning timeout in seconds. Default: None, which means early stop. Combined with max_trials to decide when to exit.

  • max_trials – Maximum number of tuning trials. Default: None, which means no tuning. Combined with timeout to decide when to exit. "timeout=0, max_trials=1" means quantization is tried only once and the best model that satisfies the criterion is returned.

  • batch – Batch size of the dataloader for calib_dataset. Defaults to None: if the dataset is not a BatchDataset, the batch size is 1; otherwise, the batch size follows dataset._batch_size.

  • inputs – A list of input names. Default: None, in which case the names are obtained automatically from the graph.

  • outputs – A list of output names. Default: None, in which case the names are obtained automatically from the graph.
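The constraints among precision, accelerator, approach, method, and calib_dataset described in the list above can be summarized as a small pre-flight check. This is only a sketch that reads the parameter descriptions literally; check_quantize_args and has_calib_dataset are hypothetical names, not part of BigDL-Nano, and the library's own validation may differ.

```python
SUPPORTED_PRECISIONS = {"int8", "bf16", "fp16"}
SUPPORTED_ACCELERATORS = {None, "onnxruntime", "openvino"}
SUPPORTED_APPROACHES = {"static", "dynamic"}
# None is accepted because the documented default 'qlinear' is applied
# when no method is given.
ONNXRUNTIME_METHODS = {None, "qlinear", "integer"}


def check_quantize_args(precision="int8", accelerator=None, approach="static",
                        method=None, has_calib_dataset=False):
    """Return a list of problems; an empty list means the combination is
    consistent with the parameter descriptions.

    Hypothetical helper for illustration; not part of BigDL-Nano.
    """
    problems = []
    if precision not in SUPPORTED_PRECISIONS:
        problems.append(f"unsupported precision: {precision!r}")
    if accelerator not in SUPPORTED_ACCELERATORS:
        problems.append(f"unsupported accelerator: {accelerator!r}")
    if approach not in SUPPORTED_APPROACHES:
        problems.append(f"unsupported approach: {approach!r}")
    # calib_dataset is described as required for static quantization.
    if approach == "static" and not has_calib_dataset:
        problems.append("static quantization requires calib_dataset")
    # OpenVINO is described as supporting static mode only.
    if accelerator == "openvino" and approach != "static":
        problems.append("OpenVINO supports static mode only")
    # Without an accelerator, the only supported method is None.
    if accelerator is None and method is not None:
        problems.append("method must be None when accelerator is None")
    if accelerator == "onnxruntime" and method not in ONNXRUNTIME_METHODS:
        problems.append(f"unsupported onnxruntime method: {method!r}")
    return problems
```

For instance, check_quantize_args(accelerator="openvino", approach="dynamic") reports the static-only restriction, while the defaults plus has_calib_dataset=True yield an empty list.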

Returns

A TensorflowBaseModel for INC. If no model is found, returns None.
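A call to quantize might look like the sketch below, which uses only keyword arguments from the signature above and the "timeout=0, max_trials=1" combination to attempt quantization once. The wrapper name quantize_once is hypothetical, and model is assumed to be an already-built bigdl.nano.tf.keras model with calib_dataset a tf.data.Dataset.

```python
def quantize_once(model, calib_dataset):
    """Run static int8 post-training quantization with a single trial.

    Hypothetical helper for illustration; `model` is assumed to expose the
    quantize method documented above.
    """
    return model.quantize(
        precision="int8",             # documented default, spelled out
        calib_dataset=calib_dataset,  # required for static quantization
        approach="static",
        timeout=0,                    # together with max_trials=1:
        max_trials=1,                 # try quantization only once
    )
```

Since the documented return value may be None when no model is found, callers would likely want to check the result before using it for inference.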

trace(accelerator=None, input_sample=None, onnxruntime_session_options=None)

Trace a Keras model and convert it into an accelerated module for inference.

For example, this function returns a KerasOpenVINOModel when accelerator='openvino'.

Parameters
  • input_sample – A set of inputs for tracing. Defaults to None, which can be used if the model has been traced before.

  • accelerator – The accelerator to use. Defaults to None, meaning staying in the Keras backend. 'openvino' and 'onnxruntime' are supported for now.

  • onnxruntime_session_options – The session options for onnxruntime. Only valid when accelerator='onnxruntime'; otherwise ignored.

Returns

A model accelerated with the chosen backend (OpenVINO or ONNX Runtime).
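To compare the supported accelerators, one traced variant can be produced per backend. The sketch below assumes model is a bigdl.nano.tf.keras model and that the corresponding runtimes are installed; trace_all is a hypothetical helper name, not part of BigDL-Nano.

```python
def trace_all(model, input_sample=None,
              accelerators=("openvino", "onnxruntime")):
    """Return a dict mapping each accelerator name to its traced model.

    Hypothetical helper for illustration; `model` is assumed to expose the
    trace method documented above.
    """
    return {
        name: model.trace(accelerator=name, input_sample=input_sample)
        for name in accelerators
    }
```

Keeping the variants in a dict makes it straightforward to time each backend on the same inputs and keep the fastest one.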

class bigdl.nano.tf.keras.Sequential(*args, **kwargs)[source]

A wrapper class for tf.keras.Sequential adding more functions for BigDL-Nano.

Create a nano Sequential model with the same arguments as tf.keras.Sequential.

fit(x=None, y=None, batch_size=None, epochs=1, verbose='auto', callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0, steps_per_epoch=None, validation_steps=None, validation_batch_size=None, validation_freq=1, max_queue_size=10, workers=1, use_multiprocessing=False, num_processes=None, backend='multiprocessing')

Override tf.keras.Model.fit to add more parameters.

All arguments that already exist in tf.keras.Model.fit have the same semantics as in tf.keras.Model.fit.

Additional parameters

Parameters
  • num_processes – when num_processes is not None, it specifies how many sub-processes to launch to run pseudo-distributed training; when num_processes is None, training will run in the current process.

  • backend – when num_processes is not None, it specifies which backend to use when launching sub-processes to run pseudo-distributed training; when num_processes is None, this parameter has no effect.

quantize(precision: str = 'int8', accelerator: Optional[str] = None, calib_dataset: Optional[tensorflow.python.data.ops.dataset_ops.DatasetV1] = None, metric: Optional[tensorflow.python.keras.metrics.Metric] = None, accuracy_criterion: Optional[dict] = None, approach: str = 'static', method: Optional[str] = None, conf: Optional[str] = None, tuning_strategy: Optional[str] = None, timeout: Optional[int] = None, max_trials: Optional[int] = None, batch=None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Perform post-training quantization on a Keras model.

Parameters
  • calib_dataset – A tf.data.Dataset object for calibration. Required for static quantization. It is also used as the validation dataloader.

  • precision – Global precision of the quantized model. Supported types: 'int8', 'bf16', 'fp16'. Defaults to 'int8'.

  • accelerator – The accelerator to use: None, 'onnxruntime', or 'openvino'. Defaults to None, which means staying in TensorFlow.

  • metric – A tensorflow.keras.metrics.Metric object for evaluation.

  • accuracy_criterion – Tolerable accuracy drop. accuracy_criterion = {'relative': 0.1, 'higher_is_better': True} allows a relative accuracy loss of 10%. accuracy_criterion = {'absolute': 0.99, 'higher_is_better': False} means accuracy must be smaller than 0.99.

  • approach – 'static' or 'dynamic'. 'static': post_training_static_quant, 'dynamic': post_training_dynamic_quant. Default: 'static'. OpenVINO supports static mode only.

  • method – The quantization method. When accelerator=None, the only supported method is None. When accelerator='onnxruntime', the supported methods are 'qlinear' and 'integer', defaulting to 'qlinear'. 'qlinear' is suggested for a lower accuracy drop when using static quantization. More details at https://onnxruntime.ai/docs/performance/quantization.html. This argument has no effect for OpenVINO; do not change it when using OpenVINO.

  • conf – A path to a conf YAML file for quantization. Default: None, which uses the default config.

  • tuning_strategy – 'bayesian', 'basic', 'mse', 'sigopt'. Default: 'bayesian'.

  • timeout – Tuning timeout in seconds. Default: None, which means early stop. Combined with max_trials to decide when to exit.

  • max_trials – Maximum number of tuning trials. Default: None, which means no tuning. Combined with timeout to decide when to exit. "timeout=0, max_trials=1" means quantization is tried only once and the best model that satisfies the criterion is returned.

  • batch – Batch size of the dataloader for calib_dataset. Defaults to None: if the dataset is not a BatchDataset, the batch size is 1; otherwise, the batch size follows dataset._batch_size.

  • inputs – A list of input names. Default: None, in which case the names are obtained automatically from the graph.

  • outputs – A list of output names. Default: None, in which case the names are obtained automatically from the graph.

Returns

A TensorflowBaseModel for INC. If no model is found, returns None.

trace(accelerator=None, input_sample=None, onnxruntime_session_options=None)

Trace a Keras model and convert it into an accelerated module for inference.

For example, this function returns a KerasOpenVINOModel when accelerator='openvino'.

Parameters
  • input_sample – A set of inputs for tracing. Defaults to None, which can be used if the model has been traced before.

  • accelerator – The accelerator to use. Defaults to None, meaning staying in the Keras backend. 'openvino' and 'onnxruntime' are supported for now.

  • onnxruntime_session_options – The session options for onnxruntime. Only valid when accelerator='onnxruntime'; otherwise ignored.

Returns

A model accelerated with the chosen backend (OpenVINO or ONNX Runtime).