BigDL-LLM Examples: CPU
Here, we provide examples of how to apply BigDL-LLM INT4 optimizations to popular open-source models from the community.
To run these examples, first refer to here for how to install bigdl-llm, its requirements, and best practices for setting up your environment.
The following models have been verified on either servers or laptops with Intel CPUs.
Example of PyTorch API
| Model | Example of PyTorch API |
|---|---|
| LLaMA 2 | link |
| ChatGLM | link |
| Mistral | link |
| Bark | link |
| BERT | link |
| OpenAI Whisper | link |
Important
In addition to INT4 optimization, BigDL-LLM also provides other low-bit optimizations (such as INT8, INT5, and NF4). You can apply them through the PyTorch API, as shown in this example.
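As a rough illustration of choosing a low-bit format with the PyTorch API, the sketch below wraps `optimize_model` from `bigdl.llm`. It assumes that your installed bigdl-llm version accepts a `low_bit` argument and the format strings listed in the mapping; the helper names `resolve_low_bit` and `optimize_with_low_bit` are hypothetical and are not part of the library.

```python
# Assumed mapping from the precision names used on this page to the
# `low_bit` strings bigdl-llm is expected to accept. Verify these
# against the documentation for your installed version.
LOW_BIT_FORMATS = {
    "INT4": "sym_int4",
    "INT5": "sym_int5",
    "INT8": "sym_int8",
    "NF4": "nf4",
}


def resolve_low_bit(precision: str) -> str:
    """Translate a precision name such as 'INT8' into a low_bit string."""
    key = precision.strip().upper()
    if key not in LOW_BIT_FORMATS:
        supported = ", ".join(sorted(LOW_BIT_FORMATS))
        raise ValueError(f"Unsupported precision {precision!r}; expected one of: {supported}")
    return LOW_BIT_FORMATS[key]


def optimize_with_low_bit(model, precision: str = "INT4"):
    """Apply a BigDL-LLM low-bit optimization to an already-loaded PyTorch model.

    The import is deferred so that this module can be imported even when
    bigdl-llm is not installed; calling the function does require it.
    """
    from bigdl.llm import optimize_model  # requires bigdl-llm

    return optimize_model(model, low_bit=resolve_low_bit(precision))
```

With a model loaded in the usual way (for example through Hugging Face `transformers`), calling `optimize_with_low_bit(model, "INT8")` would be expected to return the optimized model, which you then use for inference as before.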
Example of transformers-style API
| Model | Example of transformers-style API |
|---|---|
| LLaMA (such as Vicuna, Guanaco, Koala, Baize, and WizardLM) | link1, link2 |
| LLaMA 2 | link |
| ChatGLM | link |
| ChatGLM2 | link |
| Mistral | link |
| Falcon | link |
| MPT | link |
| Dolly-v1 | link |
| Dolly-v2 | link |
| Replit Code | link |
| RedPajama | link1, link2 |
| Phoenix | link1, link2 |
| StarCoder | link1, link2 |
| Baichuan | link |
| Baichuan2 | link |
| InternLM | link |
| Qwen | link |
| Aquila | link |
| MOSS | link |
| Whisper | link |
Important
In addition to INT4 optimization, BigDL-LLM also provides other low-bit optimizations (such as INT8, INT5, and NF4). You can apply them through the transformers-style API, as shown in this example.
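The transformers-style API can be sketched in a similar way. The code below assumes that `bigdl.llm.transformers.AutoModelForCausalLM.from_pretrained` accepts `load_in_4bit=True` for INT4 and a `load_in_low_bit` string for other formats; check both against your installed bigdl-llm version. The helper names `low_bit_load_kwargs` and `load_low_bit_model`, and the format strings in the mapping, are assumptions made for illustration.

```python
# Assumed `load_in_low_bit` strings for the non-INT4 formats named on
# this page; verify them against your installed bigdl-llm version.
LOW_BIT_STRINGS = {
    "INT5": "sym_int5",
    "INT8": "sym_int8",
    "NF4": "nf4",
}


def low_bit_load_kwargs(precision: str = "INT4") -> dict:
    """Build the from_pretrained keyword arguments for a given precision."""
    key = precision.strip().upper()
    if key == "INT4":
        # INT4 is the default optimization shown in the examples.
        return {"load_in_4bit": True}
    if key not in LOW_BIT_STRINGS:
        supported = ", ".join(["INT4"] + sorted(LOW_BIT_STRINGS))
        raise ValueError(f"Unsupported precision {precision!r}; expected one of: {supported}")
    return {"load_in_low_bit": LOW_BIT_STRINGS[key]}


def load_low_bit_model(model_path: str, precision: str = "INT4"):
    """Load a causal language model with a BigDL-LLM low-bit optimization.

    The import is deferred so that this module can be imported even when
    bigdl-llm is not installed; calling the function does require it.
    """
    from bigdl.llm.transformers import AutoModelForCausalLM  # requires bigdl-llm

    return AutoModelForCausalLM.from_pretrained(model_path, **low_bit_load_kwargs(precision))
```

Here `model_path` stands for a local checkpoint directory or a model identifier of your choice. Some models in the table may need additional `from_pretrained` arguments or a different model class, so treat the linked examples as the reference for each model.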
See also
See the complete examples here.