BigDL Documentation


BigDL makes it easy for data scientists and data engineers to build end-to-end, distributed AI applications. The BigDL 2.0 release combines the original BigDL and Analytics Zoo projects, providing the following features:

  • DLlib: distributed deep learning library for Apache Spark

  • Orca: seamlessly scale out TensorFlow and PyTorch pipelines for distributed Big Data

  • RayOnSpark: run Ray programs directly on Big Data clusters

  • Chronos: scalable time series analysis using AutoML

  • PPML: privacy-preserving big data analysis and machine learning (experimental)

  • Nano: automatically accelerate TensorFlow and PyTorch pipelines by applying modern CPU optimizations