Orca Learn#

orca.learn.bigdl.estimator#

class bigdl.orca.learn.bigdl.estimator.Estimator[source]#

Bases: object

static from_bigdl(*, model, loss=None, optimizer=None, metrics=None, feature_preprocessing=None, label_preprocessing=None, model_dir=None)[source]#

Construct an Estimator with a BigDL model, loss function, and Preprocessing for feature and label data.

Parameters
  • model – BigDL Model to be trained.

  • loss – BigDL criterion.

  • optimizer – BigDL optimizer.

  • metrics – An evaluation metric or a list of evaluation metrics.

  • feature_preprocessing

    Used when the data in fit and predict is a Spark DataFrame. This parameter converts the data in the feature column to a Tensor or to a Sample directly. It expects a List of Int as the size of the converted Tensor, or a Preprocessing[F, Tensor[T]].

    If a List of Int is set as feature_preprocessing, it can only handle the case where the feature column contains the following data types: Float, Double, Int, Array[Float], Array[Double], Array[Int] and MLlib Vector. The feature data are converted to Tensors with the specified sizes before being sent to the model. Internally, a SeqToTensor is generated according to the size and used as the feature_preprocessing.

    Alternatively, the user can set feature_preprocessing to a Preprocessing[F, Tensor[T]] that transforms the feature data into a Tensor[T]. Some pre-defined Preprocessing are provided in the bigdl.dllib.feature package. Multiple Preprocessing can be combined into a ChainedPreprocessing.

    The feature_preprocessing will also be copied to the generated NNModel and applied to the feature column during transform.

  • label_preprocessing – Used when the data in fit and predict is a Spark DataFrame. Similar to feature_preprocessing, but applied to the label data.

  • model_dir – The path to save the model. During training, if checkpoint_trigger is defined and triggered, the model will be saved to model_dir.

Returns

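The following is a minimal, illustrative sketch of constructing an Estimator with from_bigdl. The BigDL model (net), criterion and optimizer (adam) are assumed to be built elsewhere with the BigDL dllib API, and the feature_preprocessing value of [28, 28] is only an example Tensor size.

>>> from bigdl.orca.learn.bigdl.estimator import Estimator
>>> est = Estimator.from_bigdl(model=net, loss=criterion, optimizer=adam,
...                            feature_preprocessing=[28, 28],
...                            model_dir="/tmp/bigdl_model")
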
class bigdl.orca.learn.bigdl.estimator.BigDLEstimator(*, model, loss, optimizer=None, metrics=None, feature_preprocessing=None, label_preprocessing=None, model_dir=None)[source]#

Bases: bigdl.orca.learn.spark_estimator.Estimator

fit(data, epochs, batch_size=32, feature_cols='features', label_cols='label', caching_sample=True, validation_data=None, validation_trigger=None, checkpoint_trigger=None)[source]#

Train this BigDL model with train data.

Parameters
  • data – train data. It can be XShards or Spark DataFrame. If data is XShards, each partition is a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a list of numpy arrays.

  • epochs – Number of epochs to train the model.

  • batch_size – Batch size used for training. Default: 32.

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame. Default: “features”.

  • label_cols – Label column name(s) of data. Only used when data is a Spark DataFrame. Default: “label”.

  • caching_sample – Whether to cache the Samples after preprocessing. Default: True.

  • validation_data – Validation data. XShards and Spark DataFrame are supported. If data is XShards, each partition is a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a list of numpy arrays.

  • validation_trigger – Orca Trigger to trigger validation computation.

  • checkpoint_trigger – Orca Trigger to set a checkpoint.

Returns

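A hedged sketch of a typical fit call on a Spark DataFrame, continuing from the est created above; df and val_df are assumed DataFrames with "features" and "label" columns, and EveryEpoch is assumed to come from bigdl.orca.learn.trigger as referenced elsewhere in these docs.

>>> from bigdl.orca.learn.trigger import EveryEpoch
>>> est.fit(df, epochs=5, batch_size=64,
...         feature_cols="features", label_cols="label",
...         validation_data=val_df,
...         validation_trigger=EveryEpoch(),
...         checkpoint_trigger=EveryEpoch())
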
predict(data, batch_size=4, feature_cols='features', sample_preprocessing=None)[source]#

Predict input data

Parameters
  • data – predict input data. It can be XShards or Spark DataFrame. If data is XShards, each partition is a dictionary of {‘x’: feature}, where feature is a numpy array or a list of numpy arrays.

  • batch_size – Batch size used for inference. Default: 4.

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame. Default: “features”.

  • sample_preprocessing – Used when data is a Spark DataFrame. If the user wants to change the default feature_preprocessing specified in Estimator.from_bigdl, a new sample_preprocessing method can be passed here.

Returns

predicted result. If input data is Spark DataFrame, the predict result is a DataFrame which includes original columns plus the ‘prediction’ column. The ‘prediction’ column can be FloatType, VectorUDT or Array of VectorUDT depending on the model output shape. If input data is an XShards, the predict result is also an XShards, and each partition of the XShards is a dictionary of {‘prediction’: result}, where result is a numpy array or a list of numpy arrays.

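A short usage sketch of predict on the same assumed DataFrame; the returned DataFrame simply gains a ‘prediction’ column.

>>> pred_df = est.predict(df, batch_size=4, feature_cols="features")
>>> pred_df.select("prediction").show(5)
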
evaluate(data, batch_size=32, feature_cols='features', label_cols='label')[source]#

Evaluate model.

Parameters
  • data – validation data. It can be XShards or Spark DataFrame. If data is XShards, each partition is a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a list of numpy arrays.

  • batch_size – Batch size used for validation. Default: 32.

  • feature_cols – (Not supported yet) Feature column name(s) of data. Only used when data is a Spark DataFrame. Default: None.

  • label_cols – (Not supported yet) Label column name(s) of data. Only used when data is a Spark DataFrame. Default: None.

Returns

get_model()[source]#

Get the trained BigDL model

Returns

The trained BigDL model

save(model_path)[source]#

Save the BigDL model to model_path

Parameters

model_path – path to save the trained model.

Returns

load(checkpoint, optimizer=None, loss=None, feature_preprocessing=None, label_preprocessing=None, model_dir=None, is_checkpoint=False)[source]#

Load existing BigDL model or checkpoint

Parameters
  • checkpoint – Path to the existing model or checkpoint.

  • optimizer – BigDL optimizer.

  • loss – BigDL criterion.

  • feature_preprocessing

    Used when the data in fit and predict is a Spark DataFrame. This parameter converts the data in the feature column to a Tensor or to a Sample directly. It expects a List of Int as the size of the converted Tensor, or a Preprocessing[F, Tensor[T]].

    If a List of Int is set as feature_preprocessing, it can only handle the case where the feature column contains the following data types: Float, Double, Int, Array[Float], Array[Double], Array[Int] and MLlib Vector. The feature data are converted to Tensors with the specified sizes before being sent to the model. Internally, a SeqToTensor is generated according to the size and used as the feature_preprocessing.

    Alternatively, the user can set feature_preprocessing to a Preprocessing[F, Tensor[T]] that transforms the feature data into a Tensor[T]. Some pre-defined Preprocessing are provided in the bigdl.dllib.feature package. Multiple Preprocessing can be combined into a ChainedPreprocessing.

    The feature_preprocessing will also be copied to the generated NNModel and applied to the feature column during transform.

  • label_preprocessing – Used when the data in fit and predict is a Spark DataFrame. Similar to feature_preprocessing, but applied to the label data.

  • model_dir – The path to save the model. During training, if checkpoint_trigger is defined and triggered, the model will be saved to model_dir.

  • is_checkpoint – Whether the path is a checkpoint or a saved BigDL model. Default: False.

Returns

The loaded estimator object.

load_orca_checkpoint(path, version=None, prefix=None)[source]#

Load an existing checkpoint. To load a specific checkpoint, please provide both version and prefix. If version is None, then the latest checkpoint under the specified directory will be loaded.

Parameters
  • path – Path to the existing checkpoint (or directory containing Orca checkpoint files).

  • version – checkpoint version, which is the suffix of the model.* file, e.g., for the model.4 file, the version is 4. If it is None, the latest checkpoint will be loaded.

  • prefix – optimMethod prefix, for example ‘optimMethod-Sequentialf53bddcc’

Returns

clear_gradient_clipping()[source]#

Clear gradient clipping parameters. In this case, gradient clipping will not be applied. In order to take effect, it needs to be called before fit.

Returns

set_constant_gradient_clipping(min, max)[source]#

Set constant gradient clipping during the training process. In order to take effect, it needs to be called before fit.

Parameters
  • min – The minimum value to clip by.

  • max – The maximum value to clip by.

Returns

set_l2_norm_gradient_clipping(clip_norm)[source]#

Clip gradient to a maximum L2-Norm during the training process. In order to take effect, it needs to be called before fit.

Parameters

clip_norm – Gradient L2-Norm threshold.

Returns

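A brief sketch of the gradient clipping calls; as noted above, they must be invoked before fit to take effect, and only one of the two clipping modes is typically active at a time.

>>> est.set_constant_gradient_clipping(-5.0, 5.0)   # clip gradient values into [-5, 5]
>>> est.clear_gradient_clipping()                   # remove the constant clipping again
>>> est.set_l2_norm_gradient_clipping(2.0)          # clip gradients to an L2 norm of 2.0
>>> est.fit(df, epochs=5)
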
get_train_summary(tag=None)[source]#

Get the scalar from model train summary.

This method will return a list of summary data of [iteration_number, scalar_value, timestamp].

Parameters

tag – The string variable representing the desired scalar.

get_validation_summary(tag=None)[source]#

Get the scalar from model validation summary.

This method will return a list of summary data of [iteration_number, scalar_value, timestamp]. Note that the metric and tag may not be consistent. Please look up the following table to determine the tag to pass. The left side is your metric during compile; the right side is the tag you should pass.

>>> 'Accuracy'                  |   'Top1Accuracy'
>>> 'BinaryAccuracy'            |   'Top1Accuracy'
>>> 'CategoricalAccuracy'       |   'Top1Accuracy'
>>> 'SparseCategoricalAccuracy' |   'Top1Accuracy'
>>> 'AUC'                       |   'AucScore'
>>> 'HitRatio'                  |   'HitRate@k' (k is Top-k)
>>> 'Loss'                      |   'Loss'
>>> 'MAE'                       |   'MAE'
>>> 'NDCG'                      |   'NDCG'
>>> 'TFValidationMethod'        |   '${name + " " + valMethod.toString()}'
>>> 'Top5Accuracy'              |   'Top5Accuracy'
>>> 'TreeNNAccuracy'            |   'TreeNNAccuracy()'
>>> 'MeanAveragePrecision'      |   'MAP@k' (k is Top-k) (BigDL)
>>> 'MeanAveragePrecision'      |   'PascalMeanAveragePrecision' (Zoo)
>>> 'StatelessMetric'           |   '${name}'
Parameters

tag – The string variable representing the desired scalar.

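A small sketch of reading back training and validation scalars; the tag names follow the mapping table above (e.g. a Keras-style ‘Accuracy’ metric is queried as ‘Top1Accuracy’).

>>> train_loss = est.get_train_summary(tag="Loss")
>>> val_acc = est.get_validation_summary(tag="Top1Accuracy")
>>> # each entry is [iteration_number, scalar_value, timestamp]
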
orca.learn.tf.estimator#

class bigdl.orca.learn.tf.estimator.Estimator[source]#

Bases: bigdl.orca.learn.spark_estimator.Estimator

fit(data, epochs, batch_size=32, feature_cols=None, label_cols=None, validation_data=None, session_config=None, checkpoint_trigger=None, auto_shard_files=False)[source]#

Train the model with train data.

Parameters
  • data – train data. It can be XShards, Spark DataFrame, tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays.

  • epochs – number of epochs to train.

  • batch_size – total batch size for each iteration. Default: 32.

  • feature_cols – feature column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • label_cols – label column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • validation_data – validation data. Validation data type should be the same as train data.

  • session_config – tensorflow session configuration for training. Should be an object of tf.ConfigProto.

  • checkpoint_trigger – when to trigger checkpoint during training. Should be a bigdl.orca.learn.trigger, e.g. EveryEpoch(), SeveralIteration(num_iterations), etc.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

predict(data, batch_size=4, feature_cols=None, auto_shard_files=False)[source]#

Predict input data

Parameters
  • data – data to be predicted. It can be XShards, Spark DataFrame. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature}, where feature is a numpy array or a tuple of numpy arrays.

  • batch_size – batch size per thread

  • feature_cols – list of feature column names if input data is Spark DataFrame or XShards of Pandas DataFrame.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

Returns

predicted result. If input data is XShards or tf.data.Dataset, the predict result is an XShards, and each partition of the XShards is a dictionary of {‘prediction’: result}, where result is a numpy array or a list of numpy arrays. If input data is Spark DataFrame, the predict result is a DataFrame which includes original columns plus the ‘prediction’ column. The ‘prediction’ column can be FloatType, VectorUDT or Array of VectorUDT depending on the model output shape.

evaluate(data, batch_size=32, feature_cols=None, label_cols=None, auto_shard_files=False)[source]#

Evaluate model.

Parameters
  • data – evaluation data. It can be XShards, Spark DataFrame, tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays. If data is tf.data.Dataset, each element is a tuple of input tensors.

  • batch_size – batch size per thread.

  • feature_cols – feature column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • label_cols – label column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

Returns

evaluation result as a dictionary of {‘metric name’: metric value}

get_model()[source]#

Get the trained Tensorflow model

Returns

Trained model

save(model_path)[source]#

Save model to model_path

Parameters

model_path – path to save the trained model.

Returns

load(model_path)[source]#

Load existing model

Parameters

model_path – Path to the existing model.

Returns

clear_gradient_clipping()[source]#

Clear gradient clipping parameters. In this case, gradient clipping will not be applied. In order to take effect, it needs to be called before fit.

Returns

set_constant_gradient_clipping(min, max)[source]#

Set constant gradient clipping during the training process. In order to take effect, it needs to be called before fit.

Parameters
  • min – The minimum value to clip by.

  • max – The maximum value to clip by.

Returns

set_l2_norm_gradient_clipping(clip_norm)[source]#

Clip gradient to a maximum L2-Norm during the training process. In order to take effect, it needs to be called before fit.

Parameters

clip_norm – Gradient L2-Norm threshold.

Returns

get_train_summary(tag: Optional[str] = None) Optional[List[List[float]]][source]#

Get the scalar from model train summary.

This method will return a list of summary data of [iteration_number, scalar_value, timestamp].

Parameters

tag – The string variable representing the desired scalar.

get_validation_summary(tag: Optional[str] = None) Optional[List[List[float]]][source]#

Get the scalar from model validation summary.

This method will return a list of summary data of [iteration_number, scalar_value, timestamp]. Note that the metric and tag may not be consistent. Please look up the following table to determine the tag to pass. The left side is your metric during compile; the right side is the tag you should pass.

>>> 'Accuracy'                  |   'Top1Accuracy'
>>> 'BinaryAccuracy'            |   'Top1Accuracy'
>>> 'CategoricalAccuracy'       |   'Top1Accuracy'
>>> 'SparseCategoricalAccuracy' |   'Top1Accuracy'
>>> 'AUC'                       |   'AucScore'
>>> 'HitRatio'                  |   'HitRate@k' (k is Top-k)
>>> 'Loss'                      |   'Loss'
>>> 'MAE'                       |   'MAE'
>>> 'NDCG'                      |   'NDCG'
>>> 'TFValidationMethod'        |   '${name + " " + valMethod.toString()}'
>>> 'Top5Accuracy'              |   'Top5Accuracy'
>>> 'TreeNNAccuracy'            |   'TreeNNAccuracy()'
>>> 'MeanAveragePrecision'      |   'MAP@k' (k is Top-k) (BigDL)
>>> 'MeanAveragePrecision'      |   'PascalMeanAveragePrecision' (Zoo)
>>> 'StatelessMetric'           |   '${name}'
Parameters

tag – The string variable representing the desired scalar.

save_tf_checkpoint(path)[source]#

Save tensorflow checkpoint in this estimator.

Parameters

path – tensorflow checkpoint path.

load_tf_checkpoint(path)[source]#

Load tensorflow checkpoint to this estimator.

Parameters

path – tensorflow checkpoint path.

save_keras_model(path, overwrite=True)[source]#

Save tensorflow keras model in this estimator.

Parameters
  • path – keras model save path.

  • overwrite – Whether to silently overwrite any existing file at the target location.

save_keras_weights(filepath, overwrite=True, save_format=None)[source]#

Save tensorflow keras model weights in this estimator.

Parameters
  • filepath – keras model weights save path.

  • overwrite – Whether to silently overwrite any existing file at the target location.

  • save_format – Either ‘tf’ or ‘h5’. A filepath ending in ‘.h5’ or ‘.keras’ will default to HDF5 if save_format is None. Otherwise None defaults to ‘tf’.

load_keras_weights(filepath, by_name=False)[source]#

Load tensorflow keras model weights in this estimator.

Parameters
  • filepath – keras model weights save path.

  • by_name – Boolean, whether to load weights by name or by topological order. Only topological loading is supported for weight files in TensorFlow format.

load_orca_checkpoint(path: str, version: Optional[int] = None) None[source]#

Load Orca checkpoint. To load a specific checkpoint, please provide a version. If version is None, then the latest checkpoint will be loaded.

Parameters
  • path – checkpoint directory which contains model.* and optimMethod-TFParkTraining.* files.

  • version – checkpoint version, which is the suffix of the model.* file, e.g., for the model.4 file, the version is 4.

static from_graph(*, inputs: Tensor, outputs: Optional[Tensor] = None, labels: Optional[Tensor] = None, loss: Optional[Tensor] = None, optimizer: Optional[Optimizer] = None, metrics: Optional[Metric] = None, clip_norm: Optional[float] = None, clip_value: Optional[Union[float, Tuple[float, float]]] = None, updates: Optional[List[Variable]] = None, sess: Optional[Session] = None, model_dir: Optional[str] = None, backend: str = 'bigdl') bigdl.orca.learn.spark_estimator.Estimator[source]#

Create an Estimator for a TensorFlow graph.

Parameters
  • inputs – input tensorflow tensors.

  • outputs – output tensorflow tensors.

  • labels – label tensorflow tensors.

  • loss – The loss tensor of the TensorFlow model, should be a scalar

  • optimizer – tensorflow optimization method.

  • clip_norm – float >= 0. Gradients will be clipped when their L2 norm exceeds this value.

  • clip_value – a float >= 0 or a tuple of two floats. If clip_value is a float, gradients will be clipped when their absolute value exceeds this value. If clip_value is a tuple of two floats, gradients will be clipped when their value is less than clip_value[0] or larger than clip_value[1].

  • metrics – metric tensor.

  • updates – Collection for the update ops. For example, when performing batch normalization, the moving_mean and moving_variance should be updated and the user should add tf.GraphKeys.UPDATE_OPS to updates. Default is None.

  • sess – the current TensorFlow Session. If you want to use a pre-trained model, you should use this Session to load the pre-trained variables and pass it to the estimator.

  • model_dir – location to save model checkpoint and summaries.

  • backend – backend for estimator. Currently it can only be “bigdl”.

Returns

an Estimator object.

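A minimal sketch of building a graph Estimator; the TF1-style placeholders and the my_model function are hypothetical, and the loss is assumed to be a scalar tensor as required above.

>>> import tensorflow as tf
>>> from bigdl.orca.learn.tf.estimator import Estimator
>>> images = tf.placeholder(tf.float32, shape=[None, 784])
>>> labels = tf.placeholder(tf.int32, shape=[None])
>>> logits = my_model(images)  # user-defined model function (assumed)
>>> loss = tf.reduce_mean(
...     tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits))
>>> est = Estimator.from_graph(inputs=images, outputs=logits,
...                            labels=labels, loss=loss,
...                            optimizer=tf.train.AdamOptimizer(),
...                            model_dir="/tmp/tf_graph")
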
static from_keras(keras_model: Model, metrics: Optional[Metric] = None, model_dir: Optional[str] = None, optimizer: Optional[Optimizer] = None, backend: str = 'bigdl') bigdl.orca.learn.spark_estimator.Estimator[source]#

Create an Estimator from a tensorflow.keras model. The model must be compiled.

Parameters
  • keras_model – the tensorflow.keras model, which must be compiled.

  • metrics – user specified metric.

  • model_dir – location to save model checkpoint and summaries.

  • optimizer – an optional orca optimMethod that will override the optimizer in keras_model.compile

  • backend – backend for estimator. Currently it can only be “bigdl”.

Returns

an Estimator object.

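A hedged sketch of from_keras; the tensorflow.keras model below is illustrative and, as required above, it must be compiled before creating the Estimator.

>>> import tensorflow as tf
>>> from bigdl.orca.learn.tf.estimator import Estimator
>>> model = tf.keras.Sequential([
...     tf.keras.layers.Dense(10, activation="softmax", input_shape=(784,))])
>>> model.compile(optimizer="adam",
...               loss="sparse_categorical_crossentropy",
...               metrics=["accuracy"])
>>> est = Estimator.from_keras(keras_model=model)
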
static load_keras_model(path: str) bigdl.orca.learn.spark_estimator.Estimator[source]#

Create Estimator by loading an existing keras model (with weights) from HDF5 file.

Parameters

path – String. The path to the pre-defined model.

Returns

Orca TF Estimator.

bigdl.orca.learn.tf.estimator.is_tf_data_dataset(data: Any) bool[source]#
bigdl.orca.learn.tf.estimator.to_dataset(data: Union[bigdl.orca.data.shard.SparkXShards, bigdl.orca.data.tf.data.Dataset, pyspark.sql.dataframe.DataFrame, tensorflow.python.data.ops.dataset_ops.DatasetV1], batch_size: int, batch_per_thread: int, validation_data: Optional[Union[bigdl.orca.data.shard.SparkXShards, bigdl.orca.data.tf.data.Dataset, pyspark.sql.dataframe.DataFrame, tensorflow.python.data.ops.dataset_ops.DatasetV1]], feature_cols: Optional[List[str]], label_cols: Optional[List[str]], hard_code_batch_size: bool, sequential_order: bool, shuffle: bool, auto_shard_files: bool, memory_type: str = 'DRAM') Union[bigdl.orca.tfpark.tf_dataset.TFNdarrayDataset, bigdl.orca.data.tf.tf1_data.TF1Dataset, bigdl.orca.tfpark.tf_dataset.DataFrameDataset, bigdl.orca.tfpark.tf_dataset.TFDataDataset][source]#
bigdl.orca.learn.tf.estimator.save_model_dir(model_dir: str) str[source]#
class bigdl.orca.learn.tf.estimator.TensorFlowEstimator(*, inputs: Tensor, outputs: Optional[Tensor], labels: Optional[Tensor], loss: Optional[Tensor], optimizer: Optional[Optimizer], clip_norm: Optional[float], clip_value: Optional[Union[float, Tuple[float, float]]], metrics: Optional[Metric], updates: Optional[List[Variables]], sess: Optional[Session], model_dir: Optional[str])[source]#

Bases: bigdl.orca.learn.tf.estimator.Estimator

fit(data: Union[SparkXShards, DataFrame, tf.data.Dataset], epochs: int = 1, batch_size: int = 32, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None, validation_data: Optional[Any] = None, session_config: Optional[ConfigProto] = None, checkpoint_trigger: Optional[Trigger] = None, auto_shard_files: bool = False, feed_dict: Optional[Dict[Tensor, Tuple[Tensor, Tensor]]] = None) bigdl.orca.learn.tf.estimator.Estimator[source]#

Train this graph model with train data.

Parameters
  • data – train data. It can be XShards, Spark DataFrame, tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays. If data is tf.data.Dataset, each element is a tuple of input tensors.

  • epochs – number of epochs to train.

  • batch_size – total batch size for each iteration.

  • feature_cols – feature column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • label_cols – label column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • validation_data – validation data. Validation data type should be the same as train data.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

  • session_config – tensorflow session configuration for training. Should be an object of tf.ConfigProto.

  • feed_dict – a dictionary. The key is a TensorFlow tensor, usually a placeholder; the value is a tuple of two elements. The first element of the tuple is the value to feed to the tensor in the training phase, and the second is the value to feed to the tensor in the validation phase.

  • checkpoint_trigger – when to trigger checkpoint during training. Should be a bigdl.orca.learn.trigger, e.g. EveryEpoch(), SeveralIteration(num_iterations), etc.

predict(data: Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame], batch_size: int = 4, feature_cols: Optional[List[str]] = None, auto_shard_files: bool = False) Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame][source]#

Predict input data

Parameters
  • data – data to be predicted. It can be XShards, Spark DataFrame. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature}, where feature is a numpy array or a tuple of numpy arrays.

  • batch_size – batch size per thread

  • feature_cols – list of feature column names if input data is Spark DataFrame or XShards of Pandas DataFrame.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

Returns

predicted result. If input data is XShards or tf.data.Dataset, the predict result is an XShards, and each partition of the XShards is a dictionary of {‘prediction’: result}, where result is a numpy array or a list of numpy arrays. If input data is Spark DataFrame, the predict result is a DataFrame which includes original columns plus the ‘prediction’ column. The ‘prediction’ column can be FloatType, VectorUDT or Array of VectorUDT depending on the model output shape.

evaluate(data: Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame, tensorflow.python.data.ops.dataset_ops.DatasetV1], batch_size: int = 32, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None, auto_shard_files: bool = False) Dict[str, float][source]#

Evaluate model.

Parameters
  • data – evaluation data. It can be XShards, Spark DataFrame, tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays. If data is tf.data.Dataset, each element is a tuple of input tensors.

  • batch_size – batch size per thread.

  • feature_cols – feature column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • label_cols – label column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

Returns

evaluation result as a dictionary of {‘metric name’: metric value}

save_tf_checkpoint(path: str) None[source]#

Save tensorflow checkpoint in this estimator.

Parameters

path – tensorflow checkpoint path.

load_tf_checkpoint(path: str) None[source]#

Load tensorflow checkpoint to this estimator.

Parameters

path – tensorflow checkpoint path.

get_model()[source]#

get_model is not supported in the tensorflow graph estimator.

save(model_path: str) None[source]#

Save model (tensorflow checkpoint) to model_path

Parameters

model_path – path to save the trained model.

Returns

load(model_path: str) None[source]#

Load an existing model (tensorflow checkpoint) from model_path.

Parameters

model_path – Path to the existing tensorflow checkpoint.

Returns

clear_gradient_clipping()[source]#

Clear gradient clipping is not supported in TensorFlowEstimator.

set_constant_gradient_clipping(min, max)[source]#

Set constant gradient clipping is not supported in TensorFlowEstimator. Please pass the clip_value to Estimator.from_graph.

set_l2_norm_gradient_clipping(clip_norm)[source]#

Set l2 norm gradient clipping is not supported in TensorFlowEstimator. Please pass the clip_norm to Estimator.from_graph.

shutdown() None[source]#

Close TensorFlow session and release resources.

class bigdl.orca.learn.tf.estimator.KerasEstimator(keras_model: Model, metrics: Optional[Metric], model_dir: Optional[str], optimizer: Optional[Optimizer])[source]#

Bases: bigdl.orca.learn.tf.estimator.Estimator

fit(data: Union[SparkXShards, DataFrame, tf.data.Dataset], epochs: int = 1, batch_size: int = 32, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None, validation_data: Optional[Union[SparkXShards, DataFrame, tf.data.Dataset]] = None, session_config: Optional[ConfigProto] = None, checkpoint_trigger: Optional[bigdl.orca.learn.trigger.Trigger] = None, auto_shard_files: bool = False) bigdl.orca.learn.tf.estimator.Estimator[source]#

Train this keras model with train data.

Parameters
  • data – train data. It can be XShards, Spark DataFrame, tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays. If data is tf.data.Dataset, each element is [feature tensor tuple, label tensor tuple]

  • epochs – number of epochs to train.

  • batch_size – total batch size for each iteration.

  • feature_cols – feature column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • label_cols – label column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • validation_data – validation data. Validation data type should be the same as train data.

  • session_config – tensorflow session configuration for training. Should be an object of tf.ConfigProto.

  • checkpoint_trigger – when to trigger checkpoint during training. Should be a bigdl.orca.learn.trigger, e.g. EveryEpoch(), SeveralIteration(num_iterations), etc.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

predict(data: Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame], batch_size: int = 4, feature_cols: Optional[List[str]] = None, auto_shard_files: bool = False) Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame][source]#

Predict input data

Parameters
  • data – data to be predicted. It can be XShards, Spark DataFrame, or tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature}, where feature is a numpy array or a tuple of numpy arrays. If data is tf.data.Dataset, each element is a feature tensor tuple.

  • batch_size – batch size per thread

  • feature_cols – list of feature column names if input data is Spark DataFrame or XShards of Pandas DataFrame.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

Returns

predicted result. If input data is XShards or tf.data.Dataset, the predict result is also an XShards, and the schema for each result is: {‘prediction’: predicted numpy array or list of predicted numpy arrays}. If input data is Spark DataFrame, the predict result is a DataFrame which includes original columns plus the ‘prediction’ column. The ‘prediction’ column can be FloatType, VectorUDT or Array of VectorUDT depending on the model output shape.

evaluate(data: Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame, tensorflow.python.data.ops.dataset_ops.DatasetV1], batch_size: int = 32, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None, auto_shard_files: bool = False) Dict[str, float][source]#

Evaluate model.

Parameters
  • data – evaluation data. It can be XShards, Spark DataFrame, tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays. If data is tf.data.Dataset, each element is [feature tensor tuple, label tensor tuple]

  • batch_size – batch size per thread.

  • feature_cols – feature column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • label_cols – label column names if train data is Spark DataFrame or XShards of Pandas DataFrame.

  • auto_shard_files – whether to automatically detect if the dataset is file-based and apply sharding on files, otherwise sharding on records. Default is False.

Returns

evaluation result as a dictionary of {‘metric name’: metric value}

save_keras_model(path: str, overwrite: bool = True) None[source]#

Save tensorflow keras model in this estimator.

Parameters
  • path – keras model save path.

  • overwrite – Whether to silently overwrite any existing file at the target location.

get_model() Model[source]#

Get the trained Keras model

Returns

The trained Keras model

save(model_path: str, overwrite: bool = True) None[source]#

Save model to model_path

Parameters
  • model_path – path to save the trained model.

  • overwrite – Whether to silently overwrite any existing file at the target location.

Returns

load(model_path: str) None[source]#

Load existing keras model

Parameters

model_path – Path to the existing keras model.

Returns

clear_gradient_clipping() None[source]#

Clear gradient clipping parameters. In this case, gradient clipping will not be applied. In order to take effect, it needs to be called before fit.

Returns

set_constant_gradient_clipping(min: float, max: float) None[source]#

Set constant gradient clipping during the training process. In order to take effect, it needs to be called before fit.

Parameters
  • min – The minimum value to clip by.

  • max – The maximum value to clip by.

Returns

set_l2_norm_gradient_clipping(clip_norm: float) None[source]#

Clip gradient to a maximum L2-Norm during the training process. In order to take effect, it needs to be called before fit.

Parameters

clip_norm – Gradient L2-Norm threshold.

Returns

save_keras_weights(filepath: str, overwrite: bool = True, save_format: Optional[str] = None) None[source]#

Save tensorflow keras model weights in this estimator.

Parameters
  • filepath – keras model weights save path.

  • overwrite – Whether to silently overwrite any existing file at the target location.

  • save_format – Either ‘tf’ or ‘h5’. A filepath ending in ‘.h5’ or ‘.keras’ will default to HDF5 if save_format is None. Otherwise None defaults to ‘tf’.

load_keras_weights(filepath: str, by_name: bool = False) None[source]#

Load tensorflow keras model weights in this estimator.

Parameters
  • filepath – keras model weights save path.

  • by_name – Boolean, whether to load weights by name or by topological order. Only topological loading is supported for weight files in TensorFlow format.

orca.learn.tf2.estimator#

class bigdl.orca.learn.tf2.estimator.Estimator[source]#

Bases: object

static from_keras(*, model_creator: Optional[Callable] = None, config: Optional[Dict] = None, verbose: bool = False, workers_per_node: int = 1, compile_args_creator: Optional[Callable] = None, backend: str = 'ray', cpu_binding: bool = False, log_to_driver: bool = True, model_dir: Optional[str] = None, **kwargs) Optional[Union[TensorFlow2Estimator, SparkTFEstimator]][source]#

Create an Estimator for tensorflow 2.

Parameters
  • model_creator – (dict -> Model) This function takes in the config dict and returns a compiled TF model.

  • config – (dict) configuration passed to ‘model_creator’, ‘data_creator’. Also contains fit_config, which is passed into model.fit(data, **fit_config) and evaluate_config which is passed into model.evaluate.

  • verbose – (bool) Prints output of one model if true.

  • workers_per_node – (Int) worker number on each node. Default: 1.

  • compile_args_creator – (dict -> dict of loss, optimizer and metrics) Only used when the backend=”horovod”. This function takes in the config dict and returns a dictionary like {“optimizer”: tf.keras.optimizers.SGD(lr), “loss”: “mean_squared_error”, “metrics”: [“mean_squared_error”]}

  • backend – (string) You can choose “horovod”, “ray” or “spark” as backend. Default: ray.

  • cpu_binding – (bool) Whether to bind threads to specific CPUs. Default: False.

  • log_to_driver – (bool) Whether to display the executor log on the driver in cluster mode. Default: True. This option is only for the “spark” backend.

  • model_dir – (str) The directory to save model states. It is required for the “spark” backend. For cluster mode, it should be a shared filesystem path that can be accessed by executors.

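A minimal sketch of creating a TF2 Estimator with a model_creator function; the model architecture and the "lr" config key are assumptions for illustration (model_dir would additionally be required for the "spark" backend).

>>> import tensorflow as tf
>>> from bigdl.orca.learn.tf2.estimator import Estimator
>>> def model_creator(config):
...     model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])
...     model.compile(optimizer=tf.keras.optimizers.Adam(config.get("lr", 1e-3)),
...                   loss="mse", metrics=["mae"])
...     return model
>>> est = Estimator.from_keras(model_creator=model_creator,
...                            config={"lr": 1e-3},
...                            workers_per_node=2,
...                            backend="ray")
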
static latest_checkpoint(checkpoint_dir: str) str[source]#
bigdl.orca.learn.tf2.estimator.make_data_creator(refs: Any) Callable[source]#
bigdl.orca.learn.tf2.estimator.data_length(data)[source]#

orca.learn.tf2.tf2_ray_estimator#

Orca TF2Estimator with backend of “horovod” or “ray”.

class bigdl.orca.learn.tf2.ray_estimator.TensorFlow2Estimator(model_creator: Optional[Callable] = None, compile_args_creator: Optional[Callable] = None, config: Optional[Dict] = None, verbose: bool = False, backend: str = 'ray', workers_per_node: int = 1, cpu_binding: bool = False)[source]#

Bases: bigdl.orca.learn.ray_estimator.Estimator

fit(data: Union[SparkXShards, SparkDataFrame, TFDataset, ray.data.Dataset, Callable], epochs: int = 1, batch_size: int = 32, verbose: Union[str, int] = 1, callbacks: Optional[List[Callback]] = None, validation_data: Optional[Union[SparkXShards, SparkDataFrame, TFDataset, ray.data.Dataset, Callable]] = None, class_weight: Optional[Dict[int, float]] = None, initial_epoch: int = 0, steps_per_epoch: Optional[int] = None, validation_steps: Optional[int] = None, validation_freq: int = 1, data_config: Optional[Dict] = None, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None) Union[List[Dict[str, float]], Dict[str, float]][source]#

Train this tensorflow model with train data.

Parameters
  • data – train data. It can be XShards, Spark DataFrame, Ray Dataset or creator function which returns Iter or DataLoader. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays.

  • epochs – Number of epochs to train the model. Default: 1.

  • batch_size – Total batch size for all workers used for training. Each worker’s batch size would be this value divided by the total number of workers. Default: 32.

  • verbose – Prints output of one model if true.

  • callbacks – List of Keras compatible callbacks to apply during training.

  • validation_data – validation data. Validation data type should be the same as train data.

  • class_weight – Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function. This can be useful to tell the model to “pay more attention” to samples from an under-represented class.

  • steps_per_epoch – Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. If steps_per_epoch is None, the epoch will run until the input dataset is exhausted. When passing an infinitely repeating dataset, you must specify the steps_per_epoch argument.

  • validation_steps – Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch. Default: None.

  • validation_freq – Only relevant if validation data is provided. Integer or collections_abc.Container instance (e.g. list, tuple, etc.). If an integer, specifies how many training epochs to run before a new validation run is performed, e.g. validation_freq=2 runs validation every 2 epochs. If a Container, specifies the epochs on which to run validation, e.g. validation_freq=[1, 2, 10] runs validation at the end of the 1st, 2nd, and 10th epochs.

  • data_config – An optional dictionary that can be passed to data creator function. If data is a Ray Dataset, specifies output_signature same as in tf.data.Dataset.from_generator (If label_cols is specified, a 2-element tuple of tf.TypeSpec objects corresponding to (features, label). Otherwise, a single tf.TypeSpec corresponding to features tensor).

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame, an XShards of Pandas DataFrame or a Ray Dataset. Default: None.

  • label_cols – Label column name(s) of data. Only used when data is a Spark DataFrame, an XShards of Pandas DataFrame or a Ray Dataset. Default: None.

Returns

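A short usage sketch of fit on an assumed Spark DataFrame df with a "feature" column and a "label" column; as described above, batch_size is the total across all workers.

>>> stats = est.fit(df, epochs=2, batch_size=64,
...                 feature_cols=["feature"], label_cols=["label"],
...                 validation_data=val_df)
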
evaluate(data: Union[SparkXShards, SparkDataFrame, TFDataset, ray.data.Dataset, Callable], batch_size: int = 32, num_steps: Optional[int] = None, verbose: Union[str, int] = 1, sample_weight: Optional[np.ndarray] = None, callbacks: Optional[List[Callback]] = None, data_config: Optional[Dict] = None, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None) Dict[source]#

Evaluates the model on the validation data set.

Parameters
  • data – evaluate data. It can be XShards, Spark DataFrame, Ray Dataset or creator function which returns Iter or DataLoader. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays.

  • batch_size – Total batch size for all workers used for evaluation. Each worker’s batch size would be this value divided by the total number of workers. Default: 32.

  • num_steps – Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of None.

  • verbose – Prints output of one model if true.

  • sample_weight – Optional Numpy array of weights for the training samples, used for weighting the loss function. You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample.

  • callbacks – List of Keras compatible callbacks to apply during evaluation.

  • data_config – An optional dictionary that can be passed to data creator function. If data is a Ray Dataset, specifies output_signature same as in tf.data.Dataset.from_generator (If label_cols is specified, a 2-element tuple of tf.TypeSpec objects corresponding to (features, label). Otherwise, a single tf.TypeSpec corresponding to features tensor).

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame, an XShards of Pandas DataFrame or a Ray Dataset. Default: None.

  • label_cols – Label column name(s) of data. Only used when data is a Spark DataFrame, an XShards of Pandas DataFrame or a Ray Dataset. Default: None.

Returns

validation result

process_ray_dataset(shard, label_cols, feature_cols, data_config)[source]#
predict(data: Union[SparkXShards, SparkDataFrame, TFDataset], batch_size: Optional[int] = 32, verbose: Union[str, int] = 1, steps: Optional[int] = None, callbacks: Optional[List[Callback]] = None, data_config: Optional[Dict] = None, feature_cols: Optional[List[str]] = None, min_partition_num: Optional[int] = None, output_cols: Optional[List[str]] = None) Union[SparkXShards, SparkDataFrame][source]#

Predict the input data

Parameters
  • data – predict input data. It can be XShards, Spark DataFrame or orca.data.tf.data.Dataset. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature}, where feature is a numpy array or a tuple of numpy arrays.

  • batch_size – Total batch size for all workers used for prediction. Each worker’s batch size would be this value divided by the total number of workers. Default: 32.

  • verbose – Prints output of one model if true.

  • steps – Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None.

  • callbacks – List of Keras compatible callbacks to apply during prediction.

  • data_config – An optional dictionary that can be passed to data creator function.

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame or an XShards of Pandas DataFrame. Default: None.

  • min_partition_num – Int. An optional param for repartition the input data when data is an orca.data.tf.data.Dataset. If min_partition_num != None, the input data will be repartitioned to max(min_partition_num, worker_num) partitions. This parameter is usually used to improve the prediction performance when the model is a customized Keras model, and the number of input partitions is significantly larger than the number of workers. Note that if you set this parameter, the order of the prediction results is not guaranteed to be the same as the input order, so you need to add id information to the input to identify the corresponding prediction results. Default: None.

  • output_cols – Column name(s) of the model output data. Only used when data is a Spark DataFrame, note the order of column name(s) should be consistent with the model output data. Default: None.

Returns

get_model(sample_input: Optional[Tensor] = None) Model[source]#

Returns the learned model.

Returns

the learned model.

save_checkpoint(checkpoint: str) str[source]#

Saves the model at the provided checkpoint.

Parameters

checkpoint – (str) Path to the target checkpoint file.

load_checkpoint(checkpoint: str, **kwargs) None[source]#

Loads the model from the provided checkpoint.

Parameters

checkpoint – (str) Path to target checkpoint file.

save(filepath: str, overwrite: bool = True, include_optimizer: bool = True, save_format: Optional[str] = None, signatures: Optional[str] = None, options: Optional[SaveOptions] = None) None[source]#

Saves the model to Tensorflow SavedModel or a single HDF5 file.

Parameters
  • filepath – String, PathLike, path to SavedModel or H5 file to save the model. It can be local/hdfs/s3 filepath

  • overwrite – Whether to silently overwrite any existing file at the target location, or provide the user with a manual prompt.

  • include_optimizer – If True, save optimizer’s state together.

  • save_format – Either ‘tf’ or ‘h5’, indicating whether to save the model to Tensorflow SavedModel or HDF5. Defaults to ‘tf’ in TF 2.X, and ‘h5’ in TF 1.X.

  • signatures – Signatures to save with the SavedModel. Applicable to the ‘tf’ format only. Please see the signatures argument in tf.saved_model.save for details.

  • options – (only applies to SavedModel format) tf.saved_model.SaveOptions object that specifies options for saving to SavedModel.

load(filepath: str, custom_objects: Optional[Dict] = None, compile: bool = True, options: Optional[SaveOptions] = None) None[source]#

Loads a model saved via estimator.save().

Parameters
  • filepath – (str) Path of saved model (SavedModel or H5 file). It can be local/hdfs filepath

  • custom_objects – Optional dictionary mapping names (strings) to custom classes or functions to be considered during deserialization.

  • compile – Boolean, whether to compile the model after loading.

  • options – Optional tf.saved_model.LoadOptions object that specifies options for loading from SavedModel.

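A brief sketch pairing save and load; the path is illustrative and may be a local, HDFS or S3 location as noted above.

>>> est.save("/tmp/tf2_saved_model", save_format="tf")
>>> est.load("/tmp/tf2_saved_model")
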
save_weights(filepath: str, overwrite: bool = True, save_format: Optional[str] = None, options: Optional[SaveOptions] = None) None[source]#

Save the model weights at the provided filepath.

Parameters
  • filepath – String or PathLike, path to the file to save the weights to. When saving in TensorFlow format, this is the prefix used for checkpoint files (multiple files are generated). Note that the ‘.h5’ suffix causes weights to be saved in HDF5 format.

  • overwrite – Whether to silently overwrite any existing file at the target location, or provide the user with a manual prompt.

  • save_format – Either ‘tf’ or ‘h5’. A filepath ending in ‘.h5’ or ‘.keras’ will default to HDF5 if save_format is None. Otherwise None defaults to ‘tf’.

  • options – Optional tf.train.CheckpointOptions object that specifies options for saving weights.

Returns

load_weights(filepath: str, by_name: bool = False, skip_mismatch: bool = False, options: Optional[SaveOptions] = None) None[source]#

Load tensorflow keras model weights from the provided path.

Parameters
  • filepath – String, path to the weights file to load. For weight files in TensorFlow format, this is the file prefix (the same as was passed to save_weights). This can also be a path to a SavedModel saved from model.save.

  • by_name – Boolean, whether to load weights by name or by topological order. Only topological loading is supported for weight files in TensorFlow format.

  • skip_mismatch – Boolean, whether to skip loading of layers where there is a mismatch in the number of weights, or a mismatch in the shape of the weight (only valid when by_name=True).

  • options – Optional tf.train.CheckpointOptions object that specifies options for loading weights.

Returns

shutdown() None[source]#

Shuts down workers and releases resources.

orca.learn.tf2.tf2_spark_estimator#

Orca TF2Estimator with backend of “spark”.

class bigdl.orca.learn.tf2.pyspark_estimator.SparkTFEstimator(model_creator: Optional[Callable] = None, config: Optional[Dict] = None, compile_args_creator: Optional[Callable] = None, verbose: bool = False, workers_per_node: int = 1, model_dir: Optional[str] = None, log_to_driver: bool = True, **kwargs)[source]#

Bases: object

fit(data: Union[SparkXShards, SparkDataFrame, Callable], epochs: int = 1, batch_size: int = 32, verbose: Union[str, int] = 1, callbacks: Optional[List[Callback]] = None, validation_data: Optional[Union[SparkXShards, SparkDataFrame, Callable]] = None, class_weight: Optional[Dict[int, float]] = None, initial_epoch: int = 0, steps_per_epoch: Optional[int] = None, validation_steps: Optional[int] = None, validation_freq: int = 1, data_config: Optional[Dict] = None, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None) Dict[source]#

Train this tensorflow model with train data.

Parameters
  • data – train data. It can be XShards, Spark DataFrame or a creator function which returns an Iter or DataLoader. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays.

  • epochs – Number of epochs to train the model. Default: 1.

  • batch_size – Total batch size for all workers used for training. Each worker’s batch size would be this value divided by the total number of workers. Default: 32.

  • verbose – Prints output of one model if true.

  • callbacks – List of Keras compatible callbacks to apply during training.

  • validation_data – validation data. Validation data type should be the same as train data.

  • class_weight – Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function. This can be useful to tell the model to “pay more attention” to samples from an under-represented class.

Returns

evaluate(data: Union[SparkXShards, SparkDataFrame, Callable], batch_size: int = 32, num_steps: Optional[int] = None, verbose: Union[str, int] = 1, sample_weight: Optional[np.ndarray] = None, callbacks: Optional[List[Callback]] = None, data_config: Optional[Dict] = None, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None) Dict[source]#

Evaluates the model on the validation data set.

Parameters
  • data – evaluate data. It can be XShards, Spark DataFrame or a creator function which returns an Iter or DataLoader. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature, ‘y’: label}, where feature(label) is a numpy array or a tuple of numpy arrays.

  • batch_size – Total batch size for all workers used for evaluation. Each worker’s batch size would be this value divided by the total number of workers. Default: 32.

  • verbose – Prints output of one model if true.

  • callbacks – List of Keras compatible callbacks to apply during evaluation.

Returns

validation result

predict(data: Union[SparkXShards, SparkDataFrame], batch_size: Optional[int] = 32, verbose: Union[str, int] = 1, steps: Optional[int] = None, callbacks: Optional[List[Callback]] = None, data_config: Optional[Dict] = None, feature_cols: Optional[List[str]] = None, output_cols: Optional[List[str]] = None) Union[SparkXShards, SparkDataFrame][source]#

Predict the input data.

Parameters
  • data – predict input data. It can be XShards or Spark DataFrame. If data is XShards, each partition can be a Pandas DataFrame or a dictionary of {‘x’: feature}, where feature is a numpy array or a tuple of numpy arrays.

  • batch_size – Total batch size for all workers used for prediction. Each worker’s batch size would be this value divided by the total number of workers. Default: 32.

  • verbose – Prints output of one model if true.

  • steps – Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None.

  • callbacks – List of Keras compatible callbacks to apply during prediction.

  • data_config – An optional dictionary that can be passed to data creator function.

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame or an XShards of Pandas DataFrame. Default: None.

  • output_cols – Column name(s) of the model output data. Only used when data is a Spark DataFrame, note the order of column name(s) should be consistent with the model output data. Default: None.

Returns

save_weights(filepath: str, overwrite: bool = True, save_format: Optional[str] = None) None[source]#

Save model weights at the provided path.

Parameters
  • filepath – String or PathLike, path to the file to save the weights to. When saving in TensorFlow format, this is the prefix used for checkpoint files (multiple files are generated). Note that the ‘.h5’ suffix causes weights to be saved in HDF5 format. It can be a local, hdfs, or s3 filepath.

  • overwrite – Whether to silently overwrite any existing file at the target location, or provide the user with a manual prompt.

  • save_format – Either ‘tf’ or ‘h5’. A filepath ending in ‘.h5’ or ‘.keras’ will default to HDF5 if save_format is None. Otherwise None defaults to ‘tf’.

load_weights(filepath: str, by_name: bool = False) None[source]#

Load tensorflow keras model weights in this estimator.

Parameters
  • filepath – keras model weights save path.

  • by_name – Boolean, whether to load weights by name or by topological order. Only topological loading is supported for weight files in TensorFlow format.

save(filepath: str, overwrite: bool = True, include_optimizer: bool = True, save_format: Optional[str] = None, signatures: Optional[str] = None, options: Optional[SaveOptions] = None) None[source]#

Saves the model to Tensorflow SavedModel or a single HDF5 file.

Parameters
  • filepath – String, PathLike, path to SavedModel or H5 file to save the model. It can be local/hdfs/s3 filepath

  • overwrite – Whether to silently overwrite any existing file at the target location, or provide the user with a manual prompt.

  • include_optimizer – If True, save optimizer’s state together.

  • save_format – Either ‘tf’ or ‘h5’, indicating whether to save the model to Tensorflow SavedModel or HDF5. Defaults to ‘tf’ in TF 2.X, and ‘h5’ in TF 1.X.

  • signatures – Signatures to save with the SavedModel. Applicable to the ‘tf’ format only. Please see the signatures argument in tf.saved_model.save for details.

  • options – (only applies to SavedModel format) tf.saved_model.SaveOptions object that specifies options for saving to SavedModel.

load(filepath: str, custom_objects: Optional[Dict] = None, compile: bool = True) None[source]#

Loads a model saved via estimator.save().

Parameters
  • filepath – (str) Path of saved model.

  • custom_objects – Optional dictionary mapping names (strings) to custom classes or functions to be considered during deserialization.

  • compile – Boolean, whether to compile the model after loading.

  • options – Optional tf.saved_model.LoadOptions object that specifies options for loading from SavedModel.

get_model(set_weights: bool = True) Model[source]#

Returns the learned model.

Returns

the learned model.

shutdown() None[source]#

Shutdown estimator and release resources.

orca.learn.pytorch.estimator#

class bigdl.orca.learn.pytorch.estimator.Estimator[source]#

Bases: object

static from_torch(*, model: Optional[Union[Module, Callable[[Dict], Module]]] = None, optimizer: Optional[Union[Optimizer, Callable[[Module, Dict], Optimizer]]] = None, loss: Optional[Union[Loss, Callable[[Dict], Loss]]] = None, metrics: Optional[Union[Metric, List[Metric]]] = None, backend: str = 'spark', config: Optional[Dict] = None, workers_per_node: int = 1, scheduler_creator: Optional[Callable[[Dict], LRScheduler]] = None, use_tqdm: bool = False, model_dir: Optional[str] = None, sync_stats: bool = False, log_level: int = 20, log_to_driver: bool = True) Optional[Union[PyTorchRayEstimator, PyTorchPySparkEstimator]][source]#

Create an Estimator for PyTorch.

Parameters
  • model – A model creator function that takes the parameter “config” and returns a PyTorch model.

  • optimizer – An optimizer creator function that has two parameters “model” and “config” and returns a PyTorch optimizer. Default: None if training is not performed.

  • loss – An instance of PyTorch loss. Default: None if loss computation is not needed.

  • metrics – One or a list of Orca validation metrics. Function(s) that computes the metrics between the output and target tensors are also supported. Default: None if no validation is involved.

  • backend – The distributed backend for the Estimator. One of “spark”, “ray” or “horovod”. Default: “spark”.

  • config – A parameter config dict, CfgNode or any class instance that plays a role of configuration to create model, loss, optimizer, scheduler and data. Default: None if no config is needed.

  • workers_per_node – The number of PyTorch workers on each node. Default: 1.

  • scheduler_creator – A scheduler creator function that has two parameters “optimizer” and “config” and returns a PyTorch learning rate scheduler wrapping the optimizer. By default a scheduler will take effect automatically every epoch. Default: None if no scheduler is needed.

  • use_tqdm – Whether to use tqdm to monitor the training progress. Default: False.

  • model_dir – The path to save the PyTorch model during the training if checkpoint_trigger is defined and triggered. Default: None.

  • sync_stats – Whether to sync metrics across all distributed workers after each epoch. If set to False, only the metrics of the worker with rank 0 are printed. Default: False.

  • log_level – The log_level of each distributed worker. Default: logging.INFO.

  • log_to_driver – Whether to display executor log on driver in cluster mode for spark backend. Default: True.

Returns

An Estimator object for PyTorch.
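
A minimal sketch of creating a PyTorch Estimator with creator functions. The model, optimizer and config values are placeholders rather than part of the Orca API, and Accuracy from bigdl.orca.learn.metrics and init_orca_context from bigdl.orca are assumed to be available as documented elsewhere in BigDL:

    import torch
    import torch.nn as nn
    from bigdl.orca import init_orca_context
    from bigdl.orca.learn.pytorch.estimator import Estimator
    from bigdl.orca.learn.metrics import Accuracy

    init_orca_context(cores=4)  # local mode; cluster settings are illustrative

    def model_creator(config):
        # Any torch.nn.Module; sizes come from the config dict.
        return nn.Sequential(
            nn.Linear(config.get("input_dim", 10), config.get("hidden", 32)),
            nn.ReLU(),
            nn.Linear(config.get("hidden", 32), 2),
        )

    def optimizer_creator(model, config):
        return torch.optim.Adam(model.parameters(), lr=config.get("lr", 1e-3))

    est = Estimator.from_torch(
        model=model_creator,
        optimizer=optimizer_creator,
        loss=nn.CrossEntropyLoss(),
        metrics=[Accuracy()],
        backend="ray",                       # or "spark" / "horovod"
        config={"lr": 1e-3, "input_dim": 10},
        workers_per_node=2,
    )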

static from_mmcv(*, mmcv_runner_creator: Callable, backend: str = 'ray', workers_per_node: int = 1, config: Optional[Dict] = None) Optional[MMCVRayEstimator][source]#

Create an Estimator for MMCV.

Parameters
  • mmcv_runner_creator – A runner creator function that takes the parameter “config” and returns an MMCV runner.

  • backend – The distributed backend for the Estimator. Default: “ray”.

  • config – A parameter config dict, CfgNode or any class instance that plays a role of configuration to create the runner and data. Default: None if no config is needed.

  • workers_per_node – The number of MMCV workers on each node. Default: 1.

Returns

An Estimator object for MMCV.
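
A minimal sketch of creating an MMCV Estimator. Here build_runner_from_cfg is a hypothetical helper standing in for however you construct your MMCV runner from the config; it is not part of Orca or MMCV:

    from bigdl.orca.learn.pytorch.estimator import Estimator

    def mmcv_runner_creator(config):
        # Hypothetical helper: build and return an MMCV runner (e.g. an epoch-based
        # runner wired up with your model, optimizer and logger) from the config.
        return build_runner_from_cfg(config)

    est = Estimator.from_mmcv(
        mmcv_runner_creator=mmcv_runner_creator,
        backend="ray",
        workers_per_node=2,
        config={"work_dir": "/tmp/mmcv_work_dir"},  # illustrative config contents
    )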

static latest_checkpoint(checkpoint_dir: str) str[source]#
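
latest_checkpoint returns the path of the most recent checkpoint found under checkpoint_dir, for example the model_dir used during training. A minimal sketch with an illustrative path:

    from bigdl.orca.learn.pytorch.estimator import Estimator

    ckpt_path = Estimator.latest_checkpoint("/tmp/orca_pytorch_model_dir")
    print(ckpt_path)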

orca.learn.pytorch.pytorch_ray_estimator#

Orca PyTorch Estimator with the “horovod” or “ray” backend.

class bigdl.orca.learn.pytorch.pytorch_ray_estimator.PyTorchRayEstimator(*, model_creator: Optional[Callable[[Dict], Module]], optimizer_creator: Optional[Callable[[Module, Dict], Optimizer]] = None, loss_creator: Optional[Union[Loss, Callable[[Dict], Loss]]] = None, metrics: Optional[Union[Metric, List[Metric]]] = None, scheduler_creator: Optional[Callable[[Dict], LRScheduler]] = None, config: Dict = None, use_tqdm: bool = False, backend: str = 'ray', workers_per_node: int = 1, sync_stats: bool = True, log_level: int = 20)[source]#

Bases: bigdl.orca.learn.pytorch.core.base_ray_estimator.BaseRayEstimator

fit(data: Union[SparkXShards, SparkDataFrame, RayDataset, Callable[[Dict, int], DataLoader]], epochs: int = 1, max_steps: Optional[int] = None, batch_size: int = 32, profile: bool = False, reduce_results: bool = True, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None, validation_data: Optional[Union[SparkXShards, SparkDataFrame, Callable[[Dict, int], DataLoader]]] = None, callbacks: Optional[List[Callback]] = None) List[source]#

Trains a PyTorch model given training data for several epochs. Calls TorchRunner.train_epoch() on N parallel workers simultaneously under the hood.

Parameters
  • data – An instance of SparkXShards, a Ray Dataset, a Spark DataFrame or a function that takes config and batch_size as arguments and returns a PyTorch DataLoader for training.

  • epochs – The number of epochs to train the model. Default is 1.

  • max_steps – The max steps to train the model. Default is None. If max_steps > 0, epochs would be ignored.

  • batch_size – Total batch size for all workers used for training. Each worker’s batch size is this value divided by the total number of workers. Default is 32. If your training data is a function, you can set batch_size to be the input batch_size of the function for the PyTorch DataLoader.

  • profile – Boolean. Whether to return time stats for the training procedure. Default is False.

  • reduce_results – Boolean. Whether to average all metrics across all workers into one dict. If a metric is a non-numerical value, one value will be randomly selected from among the workers. If False, returns a list of dicts for all workers. Default is True.

  • feature_cols – feature column names if data is Spark DataFrame or Ray Dataset.

  • label_cols – label column names if data is Spark DataFrame or Ray Dataset.

  • validation_data – validation data. Validation data type should be the same as train data.

  • callbacks – A list for all callbacks. Note that only one MainCallback is allowed among all callbacks.

Returns

A list of dictionaries of metrics, one for every training epoch. If reduce_results is False, this will return a nested list of metric dictionaries whose length will be equal to the total number of workers. You can also provide custom metrics by passing in a custom HookClass (after 2.2.0) when creating the Estimator.
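
A hedged sketch of training with a data creator function, assuming est was created with Estimator.from_torch(..., backend="ray") as in the earlier sketch; the dataset below is synthetic and purely illustrative:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def train_loader_creator(config, batch_size):
        # Synthetic data, purely for illustration.
        x = torch.randn(1000, config.get("input_dim", 10))
        y = torch.randint(0, 2, (1000,))
        return DataLoader(TensorDataset(x, y), batch_size=batch_size, shuffle=True)

    # `est` is the PyTorchRayEstimator from the earlier sketch.
    stats = est.fit(
        data=train_loader_creator,
        epochs=2,
        batch_size=64,                        # total across all workers
        validation_data=train_loader_creator,
    )
    print(stats)                              # one metrics dict per epoch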

predict(data: Union[SparkXShards, SparkDataFrame], batch_size: int = 32, feature_cols: Optional[List[str]] = None, profile: bool = False, callbacks: Optional[List[Callback]] = None) Union[SparkXShards, SparkDataFrame][source]#

Use this PyTorch model to make predictions on the data.

Parameters
  • data – An instance of SparkXShards, a Ray Dataset or a Spark DataFrame

  • batch_size – Total batch size for all workers used for inference. Each worker’s batch size is this value divided by the total number of workers. Default is 32.

  • profile – Boolean. Whether to return time stats for the prediction procedure. Default is False.

  • feature_cols – feature column names if data is a Spark DataFrame or Ray Dataset.

Returns

A SparkXShards or a Spark DataFrame that contains the predictions. If the input is a SparkXShards, each shard stores the predictions under the key “prediction”.
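
A sketch of distributed inference on a Spark DataFrame, with est as in the earlier sketches; the DataFrame and column name below are illustrative:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()
    # Illustrative DataFrame with an array-typed feature column of length 10.
    df = spark.createDataFrame(
        [([0.1] * 10,), ([0.2] * 10,)],
        ["feature"],
    )

    # `est` is the PyTorchRayEstimator from the earlier sketches; the predictions
    # come back as an extra column of the returned DataFrame.
    pred_df = est.predict(df, batch_size=64, feature_cols=["feature"])
    pred_df.show()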

evaluate(data: Union[SparkXShards, SparkDataFrame, RayDataset, Callable[[Dict, int], DataLoader]], batch_size: int = 32, num_steps: int = None, profile: bool = False, reduce_results: bool = True, feature_cols: Optional[List[str]] = None, label_cols: Optional[List[str]] = None, callbacks: Optional[List[Callback]] = None) Union[List[Dict], Dict][source]#

Evaluates a PyTorch model given validation data. Note that only accuracy for classification with zero-based label is supported by default. You can override validate_batch in TorchRunner for other metrics. Calls TorchRunner.validate() on N parallel workers simultaneously under the hood.

Parameters
  • data – An instance of SparkXShards, a Spark DataFrame, a Ray Dataset or a function that takes config and batch_size as arguments and returns a PyTorch DataLoader for validation.

  • batch_size – Total batch size for all workers used for evaluation. Each worker’s batch size is this value divided by the total number of workers. Default: 32. If your validation data is a function, you can set batch_size to be the input batch_size of the function for the PyTorch DataLoader.

  • num_steps – The number of batches to compute the validation results on. This corresponds to the number of times TorchRunner.validate_batch is called.

  • profile – Boolean. Whether to return time stats for the evaluation procedure. Default is False.

  • reduce_results – Boolean. Whether to average all metrics across all workers into one dict. If a metric is a non-numerical value, one value will be randomly selected from among the workers. If False, returns a list of dicts for all workers. Default is True.

  • feature_cols – feature column names if data is a Spark DataFrame or Ray Dataset.

  • label_cols – label column names if data is a Spark DataFrame or Ray Dataset.

  • callbacks – A list for all callbacks. Note that only one MainCallback is allowed among all callbacks.

Returns

A dictionary of metrics for the given data, including validation accuracy and loss. You can also provide custom metrics by passing in a custom HookClass (after 2.2.0) when creating the Estimator.
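
A sketch of distributed evaluation with a data creator function, again with est as in the earlier sketches and synthetic, illustrative data:

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def val_loader_creator(config, batch_size):
        # Synthetic validation data, purely for illustration.
        x = torch.randn(200, config.get("input_dim", 10))
        y = torch.randint(0, 2, (200,))
        return DataLoader(TensorDataset(x, y), batch_size=batch_size)

    # `est` is the PyTorchRayEstimator from the earlier sketches.
    val_stats = est.evaluate(data=val_loader_creator, batch_size=64)
    print(val_stats)     # a dict of validation metrics (e.g. loss and accuracy)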

get_model() Module[source]#

Returns the learned PyTorch model.

Returns

The learned PyTorch model.
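
A short sketch of pulling the trained torch.nn.Module back to the driver, for example to save its weights locally; est is as in the earlier sketches and the local path is illustrative:

    import torch

    trained_model = est.get_model()   # a torch.nn.Module with the trained weights
    torch.save(trained_model.state_dict(), "/tmp/trained_model.pt")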

orca.learn.openvino.estimator#

class bigdl.orca.learn.openvino.estimator.Estimator[source]#

Bases: object

static from_openvino(*, model_path: str) bigdl.orca.learn.openvino.estimator.OpenvinoEstimator[source]#

Load an OpenVINO Estimator.

Parameters

model_path – String. The file path to the OpenVINO IR xml file.
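
A minimal sketch of creating an OpenVINO Estimator; the model path is illustrative and should point to an OpenVINO IR .xml file with its companion .bin weights file next to it:

    from bigdl.orca.learn.openvino.estimator import Estimator

    est = Estimator.from_openvino(model_path="/path/to/model.xml")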

class bigdl.orca.learn.openvino.estimator.OpenvinoEstimator(*, model_path: str)[source]#

Bases: bigdl.orca.learn.spark_estimator.Estimator

fit(data, epochs, batch_size=32, feature_cols=None, label_cols=None, validation_data=None, checkpoint_trigger=None)[source]#

Fit is not supported in OpenVINOEstimator

predict(data: Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame, numpy.ndarray, List[numpy.ndarray]], feature_cols: Optional[List[str]] = None, batch_size: Optional[int] = 4, input_cols: Optional[Union[str, List[str]]] = None, config: Optional[Dict] = None) Optional[Union[bigdl.orca.data.shard.SparkXShards, pyspark.sql.dataframe.DataFrame, numpy.ndarray, List[numpy.ndarray]]][source]#

Predict input data

Parameters
  • batch_size – Int. The batch size for prediction. Default: 4.

  • data – Data to be predicted. XShards, Spark DataFrame, numpy array and list of numpy arrays are supported. If data is XShards, each partition is a dictionary of {‘x’: feature}, where feature is a numpy array or a list of numpy arrays.

  • feature_cols – Feature column name(s) of data. Only used when data is a Spark DataFrame. Default: None.

  • input_cols – Str or a list of str. The model input names (order matters). Users can specify the input order with input_cols. If input_cols is None, the default OpenVINO model input order will be used. Default: None.

Returns

Predicted result. If the input data is XShards, the prediction result is also an XShards, and each partition of the XShards is a dictionary of {‘prediction’: result}, where result is a numpy array or a list of numpy arrays. If the input data is a numpy array or a list of numpy arrays, the prediction result is a numpy array or a list of numpy arrays.
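
A sketch of inference on numpy input, with est as in the from_openvino sketch above; the input shape is illustrative and must match the IR model's expected input:

    import numpy as np

    # Batch of 8 images in NCHW layout; shape and dtype are illustrative.
    inputs = np.random.rand(8, 3, 224, 224).astype(np.float32)

    result = est.predict(inputs, batch_size=4)
    print(type(result))   # a numpy array or a list of numpy arrays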

evaluate(data, batch_size=32, feature_cols=None, label_cols=None)[source]#

Evaluate is not supported in OpenVINOEstimator

get_model()[source]#

Get_model is not supported in OpenVINOEstimator

save(model_path: str)[source]#

Save is not supported in OpenVINOEstimator

load(model_path: str) None[source]#

Load an OpenVINO model.

Parameters

model_path – String. The file path to the OpenVINO IR xml file.

Returns

set_tensorboard(log_dir: str, app_name: str)[source]#

Set_tensorboard is not supported in OpenVINOEstimator

clear_gradient_clipping()[source]#

Clear_gradient_clipping is not supported in OpenVINOEstimator

set_constant_gradient_clipping(min, max)[source]#

Set_constant_gradient_clipping is not supported in OpenVINOEstimator

set_l2_norm_gradient_clipping(clip_norm)[source]#

Set_l2_norm_gradient_clipping is not supported in OpenVINOEstimator

get_train_summary(tag=None)[source]#

Get_train_summary is not supported in OpenVINOEstimator

get_validation_summary(tag=None)[source]#

Get_validation_summary is not supported in OpenVINOEstimator

load_orca_checkpoint(path, version)[source]#

Load_orca_checkpoint is not supported in OpenVINOEstimator

orca.learn.mpi.mpi_estimator#

class bigdl.orca.learn.mpi.MPIEstimator(model_creator, optimizer_creator, loss_creator, metrics=None, scheduler_creator=None, config=None, init_func=None, hosts=None, workers_per_node=1, env=None)[source]#

Bases: object

Create an Orca MPI Estimator.

Parameters
  • model_creator – A model creator function that takes the parameter “config” and returns a model.

  • optimizer_creator – An optimizer creator function that has two parameters “model” and “config” and returns an optimizer.

  • loss_creator – A creator function that returns a loss. Default: None if loss computation is not needed.

  • metrics – One or a list of validation metrics. Function(s) that computes the metrics between the output and target tensors are also supported.

  • scheduler_creator – A scheduler creator function that has two parameters “optimizer” and “config” and returns a learning rate scheduler wrapping the optimizer. By default a scheduler will take effect automatically every epoch. Default: None if no scheduler is needed.

  • config – A parameter config dict, that plays a role of configuration to create model, loss, optimizer, scheduler and data. Default: None if no config is needed.

  • init_func – A function that takes the parameter “config” to initialize the distributed environment for MPI, if any.

  • hosts – Host information for distributed running. It can be None, ‘all’ or a list of hostnames/IPs. If hosts is None, it runs on a single (the current) node. If hosts is ‘all’, executor hosts are obtained from the current Spark Context. Default: None.

  • workers_per_node – The number of workers on each node.

  • env – Special environment variables to be passed to the MPI environment.

fit(data, epochs=1, batch_size=32, validation_data=None, validate_batch_size=32, train_func=None, validate_func=None, train_batches=None, validate_batches=None, validate_steps=None, feature_cols=None, label_cols=None, mpi_options=None)[source]#

Run distributed training through MPI.

Parameters
  • data – An instance of a Spark DataFrame or a function that takes config as argument and returns a PyTorch DataLoader for training.

  • epochs – The number of epochs to train the model. Default is 1.

  • batch_size – Batch size on each worker used for training. Default is 32. If your training data is a function, you can set batch_size to be the input batch_size of the function for the PyTorch DataLoader.

  • validation_data – validation data. Validation data type should be the same as train data.

  • validate_batch_size – Each worker’s batch size for validation. Default is 32. If your validation data is a function, you can set validate_batch_size to be the input batch_size of the function for the PyTorch DataLoader.

  • train_func – Specific training loop to take parameters “config”, “epochs”, “model”, “train_ld”, “train_batches”, “optimizer”, “loss”, “scheduler”, “validate_func”, “valid_ld”, “metrics”, “validate_batches” and “validate_steps”. Default: None to use our default training loop

  • validate_func – Specific validate function. Default: None to use our default validation function.

  • train_batches – Specify train_batches in case of unbalanced data. Default: None to train on the whole training data.

  • validate_batches – Specify validate_batches in case of unbalanced data. Default: None to validate on the whole validation data.

  • validate_steps – Specify validate_steps to validate periodically. Note that validation would always be triggered at the end of an epoch.

  • feature_cols – Specify the feature column names if data is a Spark DataFrame.

  • label_cols – Specify the label column names if data is a Spark DataFrame.

  • mpi_options – A string of additional MPI options.

Returns

shutdown()[source]#

Shutdown the estimator and release resources.
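
A hedged end-to-end sketch of the MPI Estimator. The model, optimizer, data and config below are placeholders; the loss creator is assumed to take the config dict, mirroring the other Orca creators, and hosts="all" assumes a running Spark Context:

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset
    from bigdl.orca.learn.mpi import MPIEstimator

    def model_creator(config):
        return nn.Linear(config.get("input_dim", 10), 2)

    def optimizer_creator(model, config):
        return torch.optim.SGD(model.parameters(), lr=config.get("lr", 0.01))

    def train_data_creator(config):
        # Synthetic data, purely for illustration; takes only config per the fit doc.
        x = torch.randn(1000, config.get("input_dim", 10))
        y = torch.randint(0, 2, (1000,))
        return DataLoader(TensorDataset(x, y), batch_size=config.get("batch_size", 32))

    est = MPIEstimator(
        model_creator=model_creator,
        optimizer_creator=optimizer_creator,
        loss_creator=lambda config: nn.CrossEntropyLoss(),  # assumed creator signature
        config={"lr": 0.01, "input_dim": 10},
        hosts="all",            # obtain executor hosts from the current Spark Context
        workers_per_node=2,
    )

    est.fit(data=train_data_creator, epochs=2, batch_size=32)  # per-worker batch size
    est.shutdown()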