BigDL-LLM Examples: GPU
Here we provide examples of how you can apply BigDL-LLM INT4 optimizations to popular open-source models in the community. Before running these examples, please first refer to here for information on how to install bigdl-llm, the requirements, and best practices for setting up your environment.
Important
Only Linux is currently supported; Ubuntu 22.04 is preferred.
The following models have been verified on either servers or laptops with Intel GPUs.
Example of PyTorch API
| Model | Example of PyTorch API |
|---|---|
| LLaMA 2 | link |
| ChatGLM 2 | link |
| Mistral | link |
| Baichuan | link |
| Baichuan2 | link |
| Replit | link |
| StarCoder | link |
| Dolly-v1 | link |
| Dolly-v2 | link |
Important
In addition to INT4 optimization, BigDL-LLM also provides other low-bit optimizations (such as INT8, INT5, NF4, etc.). You may apply these other low-bit optimizations through the PyTorch API, as shown in this example.
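As a hedged sketch of what that could look like: the `optimize_model` call and the low-bit format names below are assumptions drawn from the bigdl-llm documentation, not from this page, and the model object and `xpu` device are placeholders.

```python
# Low-bit format names assumed to be accepted by optimize_model's low_bit argument;
# this tuple and the helper are illustrative, not part of the bigdl-llm API.
LOW_BIT_FORMATS = ("sym_int4", "asym_int4", "sym_int5", "asym_int5", "sym_int8", "nf4")

def check_low_bit(low_bit: str) -> str:
    """Validate a low-bit format name before passing it on."""
    if low_bit not in LOW_BIT_FORMATS:
        raise ValueError(f"unknown low-bit format: {low_bit!r}")
    return low_bit

# Usage sketch (requires bigdl-llm installed, e.g. `pip install --pre bigdl-llm[xpu]`,
# and an Intel GPU; `model` is any loaded PyTorch model):
# from bigdl.llm import optimize_model
# model = optimize_model(model, low_bit=check_low_bit("sym_int4"))  # INT4 by default
# model = model.to("xpu")  # run the optimized model on the Intel GPU
```

The validation helper is only there to make the set of low-bit choices explicit; in practice you would pass the format string to `optimize_model` directly.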
Example of `transformers`-style API
| Model | Example of `transformers`-style API |
|---|---|
| LLaMA (such as Vicuna, Guanaco, Koala, Baize, WizardLM, etc.) | link |
| LLaMA 2 | link |
| ChatGLM2 | link |
| Mistral | link |
| Falcon | link |
| MPT | link |
| Dolly-v1 | link |
| Dolly-v2 | link |
| Replit | link |
| StarCoder | link |
| Baichuan | link |
| Baichuan2 | link |
| InternLM | link |
| Qwen | link |
| Aquila | link |
| Whisper | link |
| Chinese Llama2 | link |
| GPT-J | link |
Important
In addition to INT4 optimization, BigDL-LLM also provides other low-bit optimizations (such as INT8, INT5, NF4, etc.). You may apply these other low-bit optimizations through the `transformers`-style API, as shown in this example.
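A hedged sketch of how a low-bit format might be selected with the `transformers`-style API: the `load_in_4bit` / `load_in_low_bit` keyword names and format strings are assumptions based on the bigdl-llm documentation, not this page, and the model id is a placeholder.

```python
# Assumed non-default low-bit format names for load_in_low_bit; illustrative only.
OTHER_LOW_BIT = {"asym_int4", "sym_int5", "asym_int5", "sym_int8", "nf4"}

def low_bit_kwargs(low_bit: str = "sym_int4") -> dict:
    """Build from_pretrained keyword arguments for the requested low-bit format."""
    if low_bit == "sym_int4":
        return {"load_in_4bit": True}  # assumed convenience flag for the default INT4
    if low_bit in OTHER_LOW_BIT:
        return {"load_in_low_bit": low_bit}
    raise ValueError(f"unsupported low-bit format: {low_bit!r}")

# Usage sketch (requires bigdl-llm and an Intel GPU; model id is a placeholder):
# from bigdl.llm.transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf",
#                                              **low_bit_kwargs("nf4"))
# model = model.to("xpu")
```

The helper just makes the INT4-versus-other-formats distinction explicit; in practice you would pass `load_in_4bit=True` or `load_in_low_bit="..."` directly to `from_pretrained`.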
See also
See the complete examples here.