BigDL-LLM LangChain API#
LLM Wrapper of LangChain#
Hugging Face transformers Format#
BigDL-LLM provides TransformersLLM and TransformersPipelineLLM, which implement the standard LangChain LLM interface.
- class bigdl.llm.langchain.llms.transformersllm.TransformersLLM(*args: Any, **kwargs: Any)[source]#
Bases: langchain.llms.base.LLM
Wrapper around the BigDL-LLM Transformer-INT4 model
Example
from bigdl.llm.langchain.llms import TransformersLLM
llm = TransformersLLM.from_model_id(model_id="THUDM/chatglm-6b")
- classmethod from_model_id(model_id: str, model_kwargs: Optional[dict] = None, device_map: str = 'cpu', **kwargs: Any) → langchain.llms.base.LLM [source]#
Construct an object from model_id.
- Parameters
model_id – The Hugging Face repo id of the model to download, or the path to a Hugging Face checkpoint folder.
model_kwargs – Keyword arguments that will be passed to the model and tokenizer.
kwargs – Extra arguments that will be passed to the model and tokenizer.
- Returns
An object of TransformersLLM.
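As an illustration, here is a minimal sketch of passing model_kwargs through from_model_id; the trust_remote_code flag and the prompt are assumptions made for the example, not requirements of the API.

from bigdl.llm.langchain.llms import TransformersLLM

# Assumption for this sketch: chatglm-6b ships custom modeling code,
# so trust_remote_code=True is passed via model_kwargs, which is
# forwarded to the model and tokenizer as documented above.
llm = TransformersLLM.from_model_id(
    model_id="THUDM/chatglm-6b",
    model_kwargs={"trust_remote_code": True},
)
print(llm("What is BigDL?"))  # standard LangChain LLM call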
- classmethod from_model_id_low_bit(model_id: str, model_kwargs: Optional[dict] = None, device_map: str = 'cpu', **kwargs: Any) → langchain.llms.base.LLM [source]#
Construct a low-bit object from model_id.
- Parameters
model_id – The path to the folder containing a saved BigDL-LLM low-bit model checkpoint.
model_kwargs – Keyword arguments that will be passed to the model and tokenizer.
kwargs – Extra arguments that will be passed to the model and tokenizer.
- Returns
An object of TransformersLLM.
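A minimal sketch of the intended round trip, assuming a low-bit checkpoint has already been saved to disk (the local path below is hypothetical):

from bigdl.llm.langchain.llms import TransformersLLM

# Hypothetical folder that already contains a saved BigDL-LLM low-bit
# checkpoint; the weights are loaded directly, without re-quantization.
saved_path = "/path/to/saved/low-bit-model"
llm = TransformersLLM.from_model_id_low_bit(model_id=saved_path)
print(llm("Tell me a joke."))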
- class bigdl.llm.langchain.llms.transformerspipelinellm.TransformersPipelineLLM(*args: Any, **kwargs: Any)[source]#
Bases: langchain.llms.base.LLM
Wrapper around the BigDL-LLM Transformer-INT4 model, run via transformers.pipeline()
Example
from bigdl.llm.langchain.llms import TransformersPipelineLLM
llm = TransformersPipelineLLM.from_model_id(model_id="decapoda-research/llama-7b-hf")
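Because both wrappers implement the standard LangChain LLM interface, they can be dropped into ordinary LangChain constructs. Below is a sketch using LLMChain; the prompt template and question are made up for the example.

from langchain import LLMChain, PromptTemplate
from bigdl.llm.langchain.llms import TransformersPipelineLLM

llm = TransformersPipelineLLM.from_model_id(model_id="decapoda-research/llama-7b-hf")

# An arbitrary one-variable template, chosen for this example.
template = "Question: {question}\nAnswer:"
prompt = PromptTemplate(template=template, input_variables=["question"])

chain = LLMChain(prompt=prompt, llm=llm)
print(chain.run("What is AI?"))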
Native Model#
For the llama/chatglm/bloom/gptneox/starcoder model families, you can also use the following LLM wrappers, which use the native (C++) implementation for maximum performance.
- class bigdl.llm.langchain.llms.bigdlllm.LlamaLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import LlamaLLM
llm = LlamaLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
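As a usage sketch, get_num_tokens can guard against overlong prompts before generation; the 2048-token budget below is an assumption for the example, since context sizes vary by model.

from bigdl.llm.langchain.llms import LlamaLLM

llm = LlamaLLM(model_path="/path/to/local/model.bin")  # hypothetical path

prompt = "Summarize the following article: ..."
n_tokens = llm.get_num_tokens(prompt)

# Assumed context budget for this example; real limits depend on the model.
if n_tokens <= 2048:
    print(llm(prompt))
else:
    print(f"Prompt too long ({n_tokens} tokens); shorten it first.")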
- class bigdl.llm.langchain.llms.bigdlllm.ChatGLMLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import ChatGLMLLM
llm = ChatGLMLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
- class bigdl.llm.langchain.llms.bigdlllm.BloomLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import BloomLLM
llm = BloomLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
- class bigdl.llm.langchain.llms.bigdlllm.GptneoxLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import GptneoxLLM
llm = GptneoxLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
- class bigdl.llm.langchain.llms.bigdlllm.StarcoderLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import StarcoderLLM
llm = StarcoderLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
Embeddings Wrapper of LangChain#
Hugging Face transformers AutoModel#
Wrappers around BigDL-LLM embedding models.
- class bigdl.llm.langchain.embeddings.transformersembeddings.TransformersEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: pydantic.BaseModel, langchain.embeddings.base.Embeddings
Wrapper around bigdl-llm transformers embedding models.
To use, you should have the transformers Python package installed.
Example
from bigdl.llm.langchain.embeddings import TransformersEmbeddings
embeddings = TransformersEmbeddings.from_model_id(model_id)
- classmethod from_model_id(model_id: str, model_kwargs: Optional[dict] = None, device_map: str = 'cpu', **kwargs: Any)[source]#
Construct an object from model_id.
- Parameters
model_id – The Hugging Face repo id of the model to download, or the path to a Hugging Face checkpoint folder.
model_kwargs – Keyword arguments that will be passed to the model and tokenizer.
kwargs – Extra arguments that will be passed to the model and tokenizer.
- Returns
An object of TransformersEmbeddings.
- embed(text: str, **kwargs)[source]#
Compute embeddings for a piece of text using a Hugging Face transformers model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
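Since the class implements LangChain's Embeddings interface, it can back a vector store directly. A sketch with FAISS follows, assuming the faiss package is installed; the model id and documents are placeholders.

from langchain.vectorstores import FAISS
from bigdl.llm.langchain.embeddings import TransformersEmbeddings

# Placeholder model id; any Hugging Face sentence-embedding model should fit.
embeddings = TransformersEmbeddings.from_model_id(
    model_id="sentence-transformers/all-MiniLM-L6-v2")

docs = ["BigDL-LLM accelerates LLMs on Intel hardware.",
        "LangChain chains LLM calls together."]
store = FAISS.from_texts(docs, embeddings)  # embeds the documents

# The search query is embedded via the same Embeddings interface.
print(store.similarity_search("What does BigDL-LLM do?", k=1))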
- class bigdl.llm.langchain.embeddings.transformersembeddings.TransformersBgeEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.transformersembeddings.TransformersEmbeddings
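No additional parameters are documented for this subclass, so usage presumably mirrors the parent class; a sketch with an assumed BGE model id:

from bigdl.llm.langchain.embeddings import TransformersBgeEmbeddings

# Assumed checkpoint for this example; other bge-* model ids should also fit.
embeddings = TransformersBgeEmbeddings.from_model_id(model_id="BAAI/bge-small-en-v1.5")
query_vec = embeddings.embed_query("What is BigDL-LLM?")
print(len(query_vec))  # embedding dimension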
Native Model#
For the llama/bloom/gptneox/starcoder model families, you can also use the following wrappers.
- class bigdl.llm.langchain.embeddings.bigdlllm.LlamaEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
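A sketch of both methods on LlamaEmbeddings; the model path is hypothetical, and the same pattern applies to the other native embedding wrappers below.

from bigdl.llm.langchain.embeddings import LlamaEmbeddings

embeddings = LlamaEmbeddings(model_path="/path/to/local/model.bin")  # hypothetical path

doc_vectors = embeddings.embed_documents(["first document", "second document"])
query_vector = embeddings.embed_query("a search query")
print(len(doc_vectors), len(query_vector))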
- class bigdl.llm.langchain.embeddings.bigdlllm.BloomEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
- class bigdl.llm.langchain.embeddings.bigdlllm.GptneoxEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
- class bigdl.llm.langchain.embeddings.bigdlllm.StarcoderEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
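To tie the two wrapper families together, here is a hedged end-to-end sketch: a TransformersEmbeddings instance feeds a FAISS store whose retriever backs a RetrievalQA chain over a TransformersLLM. All model ids, flags, and texts below are placeholders for illustration.

from langchain.chains import RetrievalQA
from langchain.vectorstores import FAISS
from bigdl.llm.langchain.llms import TransformersLLM
from bigdl.llm.langchain.embeddings import TransformersEmbeddings

# Placeholder embedding model and corpus.
embeddings = TransformersEmbeddings.from_model_id(
    model_id="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(
    ["BigDL-LLM runs low-bit LLMs on Intel CPUs and GPUs."], embeddings)

# Placeholder LLM; trust_remote_code is assumed to be needed for chatglm.
llm = TransformersLLM.from_model_id(
    model_id="THUDM/chatglm-6b",
    model_kwargs={"trust_remote_code": True})

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff",
                                 retriever=store.as_retriever())
print(qa.run("Where does BigDL-LLM run?"))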