BigDL-LLM LangChain API#
LLM Wrapper of LangChain#
Hugging Face transformers Format#
BigDL-LLM provides TransformersLLM and TransformersPipelineLLM, which implement the standard LangChain LLM interface.
- class bigdl.llm.langchain.llms.transformersllm.TransformersLLM(*args: Any, **kwargs: Any)[source]#
Bases: langchain.llms.base.LLM
Wrapper around the BigDL-LLM Transformer-INT4 model
Example
from bigdl.llm.langchain.llms import TransformersLLM
llm = TransformersLLM.from_model_id(model_id="THUDM/chatglm-6b")
- classmethod from_model_id(model_id: str, model_kwargs: Optional[dict] = None, device_map: str = 'cpu', **kwargs: Any) → langchain.llms.base.LLM [source]#
Construct an object from model_id.
- Parameters
model_id – The Hugging Face repo id of the model to download, or the path to a Hugging Face checkpoint folder.
model_kwargs – Keyword arguments that will be passed to the model and tokenizer.
kwargs – Extra arguments that will be passed to the model and tokenizer.
- Returns
An object of TransformersLLM.
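As an illustration, here is a minimal sketch of passing model_kwargs through from_model_id; the trust_remote_code flag and the prompt are assumptions made for the example, not requirements of the API.

from bigdl.llm.langchain.llms import TransformersLLM

# Assumption for this sketch: chatglm-6b ships custom modeling code,
# so trust_remote_code=True is passed via model_kwargs, which is
# forwarded to the model and tokenizer as documented above.
llm = TransformersLLM.from_model_id(
    model_id="THUDM/chatglm-6b",
    model_kwargs={"trust_remote_code": True},
)
print(llm("What is BigDL?"))  # standard LangChain LLM call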
- classmethod from_model_id_low_bit(model_id: str, model_kwargs: Optional[dict] = None, device_map: str = 'cpu', **kwargs: Any) → langchain.llms.base.LLM [source]#
Construct a low-bit object from model_id.
- Parameters
model_id – The path to the folder containing a saved BigDL-LLM low-bit model checkpoint.
model_kwargs – Keyword arguments that will be passed to the model and tokenizer.
kwargs – Extra arguments that will be passed to the model and tokenizer.
- Returns
An object of TransformersLLM.
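A minimal sketch of the intended round trip, assuming a low-bit checkpoint has already been saved to disk (the local path below is hypothetical):

from bigdl.llm.langchain.llms import TransformersLLM

# Hypothetical folder that already contains a saved BigDL-LLM low-bit
# checkpoint; the weights are loaded directly, without re-quantization.
saved_path = "/path/to/saved/low-bit-model"
llm = TransformersLLM.from_model_id_low_bit(model_id=saved_path)
print(llm("Tell me a joke."))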
- class bigdl.llm.langchain.llms.transformerspipelinellm.TransformersPipelineLLM(*args: Any, **kwargs: Any)[source]#
Bases: langchain.llms.base.LLM
Wrapper around the BigDL-LLM Transformer-INT4 model, run via transformers.pipeline()
Example
from bigdl.llm.langchain.llms import TransformersPipelineLLM
llm = TransformersPipelineLLM.from_model_id(model_id="decapoda-research/llama-7b-hf")
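Because both wrappers implement the standard LangChain LLM interface, they can be dropped into ordinary LangChain constructs. Below is a sketch using LLMChain; the prompt template and question are made up for the example.

from langchain import LLMChain, PromptTemplate
from bigdl.llm.langchain.llms import TransformersPipelineLLM

llm = TransformersPipelineLLM.from_model_id(model_id="decapoda-research/llama-7b-hf")

# An arbitrary one-variable template, chosen for this example.
template = "Question: {question}\nAnswer:"
prompt = PromptTemplate(template=template, input_variables=["question"])

chain = LLMChain(prompt=prompt, llm=llm)
print(chain.run("What is AI?"))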
Native Model#
For the llama/chatglm/bloom/gptneox/starcoder model families, you can also use the following LLM wrappers, which use the native (C++) implementation for maximum performance.
- class bigdl.llm.langchain.llms.bigdlllm.LlamaLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import LlamaLLM
llm = LlamaLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
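As a usage sketch, get_num_tokens can guard against overlong prompts before generation; the 2048-token budget below is an assumption for the example, since context sizes vary by model.

from bigdl.llm.langchain.llms import LlamaLLM

llm = LlamaLLM(model_path="/path/to/local/model.bin")  # hypothetical path

prompt = "Summarize the following article: ..."
n_tokens = llm.get_num_tokens(prompt)

# Assumed context budget for this example; real limits depend on the model.
if n_tokens <= 2048:
    print(llm(prompt))
else:
    print(f"Prompt too long ({n_tokens} tokens); shorten it first.")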
- class bigdl.llm.langchain.llms.bigdlllm.ChatGLMLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import ChatGLMLLM
llm = ChatGLMLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
- class bigdl.llm.langchain.llms.bigdlllm.BloomLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import BloomLLM
llm = BloomLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
- class bigdl.llm.langchain.llms.bigdlllm.GptneoxLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import GptneoxLLM
llm = GptneoxLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
- class bigdl.llm.langchain.llms.bigdlllm.StarcoderLLM(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM
- validate_environment(values: Dict) → Dict #
Validate that bigdl-llm is installed and the model family is supported.
- stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) → Generator[Dict, None, None] #
Yields result objects as they are generated in real time.
BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.
It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.
- Parameters
prompt – The prompt to pass into the model.
stop – Optional list of stop words to use when generating.
- Returns
A generator representing the stream of tokens being generated.
- Yields
Dictionary-like objects, each containing a string token and metadata. See the llama-cpp-python docs and the example below for more.
Example
from bigdl.llm.langchain.llms import StarcoderLLM
llm = StarcoderLLM(model_path="/path/to/local/model.bin", temperature=0.5)
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'", stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end='', flush=True)
- get_num_tokens(text: str) → int #
Get the number of tokens present in the text.
Useful for checking if an input will fit in a model’s context window.
- Parameters
text – The string input to tokenize.
- Returns
The number of tokens in the text.
Embeddings Wrapper of LangChain#
Hugging Face transformers AutoModel#
Wrappers around BigDL-LLM embedding models.
- class bigdl.llm.langchain.embeddings.transformersembeddings.TransformersEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: pydantic.BaseModel, langchain.embeddings.base.Embeddings
Wrapper around bigdl-llm transformers embedding models.
To use, you should have the transformers Python package installed.
Example
from bigdl.llm.langchain.embeddings import TransformersEmbeddings
embeddings = TransformersEmbeddings.from_model_id(model_id)
- classmethod from_model_id(model_id: str, model_kwargs: Optional[dict] = None, device_map: str = 'cpu', **kwargs: Any)[source]#
Construct an object from model_id.
- Parameters
model_id – The Hugging Face repo id of the model to download, or the path to a Hugging Face checkpoint folder.
model_kwargs – Keyword arguments that will be passed to the model and tokenizer.
kwargs – Extra arguments that will be passed to the model and tokenizer.
- Returns
An object of TransformersEmbeddings.
- embed(text: str, **kwargs)[source]#
Compute embeddings for a piece of text using a Hugging Face transformers model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
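Since the class implements LangChain's Embeddings interface, it can back a vector store directly. A sketch with FAISS follows, assuming the faiss package is installed; the model id and documents are placeholders.

from langchain.vectorstores import FAISS
from bigdl.llm.langchain.embeddings import TransformersEmbeddings

# Placeholder model id; any Hugging Face sentence-embedding model should fit.
embeddings = TransformersEmbeddings.from_model_id(
    model_id="sentence-transformers/all-MiniLM-L6-v2")

docs = ["BigDL-LLM accelerates LLMs on Intel hardware.",
        "LangChain chains LLM calls together."]
store = FAISS.from_texts(docs, embeddings)  # embeds the documents

# The search query is embedded via the same Embeddings interface.
print(store.similarity_search("What does BigDL-LLM do?", k=1))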
- class bigdl.llm.langchain.embeddings.transformersembeddings.TransformersBgeEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.transformersembeddings.TransformersEmbeddings
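No additional parameters are documented for this subclass, so usage presumably mirrors the parent class; a sketch with an assumed BGE model id:

from bigdl.llm.langchain.embeddings import TransformersBgeEmbeddings

# Assumed checkpoint for this example; other bge-* model ids should also fit.
embeddings = TransformersBgeEmbeddings.from_model_id(model_id="BAAI/bge-small-en-v1.5")
query_vec = embeddings.embed_query("What is BigDL-LLM?")
print(len(query_vec))  # embedding dimension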
Native Model#
For the llama/bloom/gptneox/starcoder model families, you can also use the following wrappers.
- class bigdl.llm.langchain.embeddings.bigdlllm.LlamaEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
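A sketch of both methods on LlamaEmbeddings; the model path is hypothetical, and the same pattern applies to the other native embedding wrappers below.

from bigdl.llm.langchain.embeddings import LlamaEmbeddings

embeddings = LlamaEmbeddings(model_path="/path/to/local/model.bin")  # hypothetical path

doc_vectors = embeddings.embed_documents(["first document", "second document"])
query_vector = embeddings.embed_query("a search query")
print(len(doc_vectors), len(query_vector))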
- class bigdl.llm.langchain.embeddings.bigdlllm.BloomEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
- class bigdl.llm.langchain.embeddings.bigdlllm.GptneoxEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
- class bigdl.llm.langchain.embeddings.bigdlllm.StarcoderEmbeddings(*args: Any, **kwargs: Any)[source]#
Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings
- validate_environment(values: Dict) → Dict #
Validate that the bigdl-llm library is installed.
- embed_documents(texts: List[str]) → List[List[float]] #
Embed a list of documents using the optimized int4 model.
- Parameters
texts – The list of texts to embed.
- Returns
List of embeddings, one for each text.
- embed_query(text: str) → List[float] #
Embed a query using the optimized int4 model.
- Parameters
text – The text to embed.
- Returns
Embeddings for the text.
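To tie the two wrapper families together, here is a hedged end-to-end sketch: a TransformersEmbeddings instance feeds a FAISS store whose retriever backs a RetrievalQA chain over a TransformersLLM. All model ids, flags, and texts below are placeholders for illustration.

from langchain.chains import RetrievalQA
from langchain.vectorstores import FAISS
from bigdl.llm.langchain.llms import TransformersLLM
from bigdl.llm.langchain.embeddings import TransformersEmbeddings

# Placeholder embedding model and corpus.
embeddings = TransformersEmbeddings.from_model_id(
    model_id="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_texts(
    ["BigDL-LLM runs low-bit LLMs on Intel CPUs and GPUs."], embeddings)

# Placeholder LLM; trust_remote_code is assumed to be needed for chatglm.
llm = TransformersLLM.from_model_id(
    model_id="THUDM/chatglm-6b",
    model_kwargs={"trust_remote_code": True})

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff",
                                 retriever=store.as_retriever())
print(qa.run("Where does BigDL-LLM run?"))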