BigDL-Nano Features

Feature	Meaning
Intel-openmp	Use the Intel OpenMP library to improve the performance of multithreaded programs
Jemalloc	Use jemalloc as the memory allocator
Tcmalloc	Use tcmalloc as the memory allocator
Neural-Compressor	Neural-Compressor int8 quantization
OpenVINO	OpenVINO fp32/bf16/fp16/int8 acceleration on CPU/GPU/VPU
ONNXRuntime	ONNXRuntime fp32/int8 acceleration
CUDA patch	Run CUDA code even without a GPU
JIT	PyTorch JIT optimization
Channel last	Channels-last (NHWC) memory format
BF16	BFloat16 mixed-precision training and inference
IPEX	Intel Extension for PyTorch (IPEX) optimization
Multi-instance	Multi-process training and inference
ray	Use Ray as the multi-process backend

Common Feature Support (can be used with both PyTorch and TensorFlow)

Feature	Ubuntu (20.04/22.04)	CentOS7	MacOS (Intel chip)	MacOS (M-series chip)	Windows
Intel-openmp	✅	✅	✅	②	✅
Jemalloc	✅	✅	✅	❌	❌
Tcmalloc	✅	❌	❌	❌	❌
Neural-Compressor	✅	✅	❌	❌	?
OpenVINO	✅	①	❌	❌	④
ONNXRuntime	✅	①	✅	❌	✅
ray	✅	?	?	?	④
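
To read the tables on this page you first need to know which column applies to your machine. The following is a minimal sketch, not part of BigDL-Nano: the function names `support_column` and `current_distro_id` are hypothetical, and the mapping from platform strings to column headers is an assumption. It does not check OS versions (20.04/22.04 or CentOS 7), and a Python interpreter running under emulation may report a different architecture than the physical chip.

```python
import platform
from typing import Optional


def support_column(system: str, machine: str, distro_id: str = "") -> Optional[str]:
    """Map platform facts to a column header used in the tables on this page.

    Returns None when the machine matches none of the listed columns.
    Versions are not checked; that is left to the reader.
    """
    system = system.lower()
    machine = machine.lower()
    distro_id = distro_id.lower()

    if system == "windows":
        return "Windows"
    if system == "darwin":
        # Apple-silicon Macs report "arm64"; Intel Macs report "x86_64".
        return "MacOS (M-series chip)" if machine == "arm64" else "MacOS (Intel chip)"
    if system == "linux":
        if distro_id == "ubuntu":
            return "Ubuntu (20.04/22.04)"
        if distro_id == "centos":
            return "CentOS7"
    return None


def current_distro_id() -> str:
    """Best-effort read of the ID field from /etc/os-release (Linux only)."""
    try:
        with open("/etc/os-release", encoding="utf-8") as f:
            for line in f:
                if line.startswith("ID="):
                    return line.strip().split("=", 1)[1].strip('"')
    except OSError:
        pass
    return ""


if __name__ == "__main__":
    print(support_column(platform.system(), platform.machine(), current_distro_id()))
```

Passing the platform facts in as arguments, rather than reading them inside the function, keeps the mapping easy to check for machines other than the one you are on.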

PyTorch Feature Support

Feature	Ubuntu (20.04/22.04)	CentOS7	MacOS (Intel chip)	MacOS (M-series chip)	Windows
CUDA patch	✅	✅	✅	?	✅
JIT	✅	✅	✅	?	✅
Channel last	✅	✅	✅	?	✅
BF16	✅	✅	⭕	⭕	✅
IPEX	✅	✅	❌	❌	❌
Multi-instance	✅	✅	②	②	②

TensorFlow Feature Support

Feature	Ubuntu (20.04/22.04)	CentOS7	MacOS (Intel chip)	MacOS (M-series chip)	Windows
BF16	✅	✅	⭕	⭕	✅
Multi-instance	③	③	②③	②③	❌

Symbol Meaning

Symbol	Meaning
✅	Supported
❌	Not supported
⭕	No Mac machine (Intel or M-series chip) supports the bf16 instruction set, so this feature brings no benefit there
①	This feature is only supported when used together with jemalloc
②	This feature is supported but without any performance guarantee
③	Only multi-instance training (not inference) is supported for now
④	This feature is only supported when using PyTorch
?	Not tested
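
Some cells combine several symbols (for example ②③), so each symbol has to be looked up separately. As an illustration only, the sketch below encodes the TensorFlow table and a paraphrased legend as plain Python data and expands one cell into its meanings. The names `LEGEND`, `COLUMNS`, `TENSORFLOW_SUPPORT` and `explain` are hypothetical and are not part of BigDL-Nano.

```python
# Legend paraphrased from the "Symbol Meaning" table.
LEGEND = {
    "✅": "Supported",
    "❌": "Not supported",
    "⭕": "Not applicable: Macs lack bf16 instructions",
    "①": "Only supported together with jemalloc",
    "②": "Supported, but without any performance guarantee",
    "③": "Only multi-instance training is supported for now",
    "④": "Only supported when using PyTorch",
    "?": "Not tested",
}

# Column order shared by all support tables on this page.
COLUMNS = [
    "Ubuntu (20.04/22.04)",
    "CentOS7",
    "MacOS (Intel chip)",
    "MacOS (M-series chip)",
    "Windows",
]

# Rows copied from the "TensorFlow Feature Support" table.
TENSORFLOW_SUPPORT = {
    "BF16": ["✅", "✅", "⭕", "⭕", "✅"],
    "Multi-instance": ["③", "③", "②③", "②③", "❌"],
}


def explain(table, feature, column):
    """Return the legend text for every symbol in one table cell."""
    cell = table[feature][COLUMNS.index(column)]
    # A cell may carry several symbols (e.g. "②③"); explain each in order.
    return [LEGEND[symbol] for symbol in cell]


if __name__ == "__main__":
    for meaning in explain(TENSORFLOW_SUPPORT, "Multi-instance", "MacOS (Intel chip)"):
        print(meaning)
```

The other tables can be added in the same shape, one list of five cells per feature.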