BigDL-LLM LangChain API#

LLM Wrapper of LangChain#

Hugging Face transformers Format#

BigDL-LLM provides TransformersLLM and TransformersPipelineLLM, both of which implement LangChain's standard LLM wrapper interface.

class bigdl.llm.langchain.llms.transformersllm.TransformersLLM(*args: Any, **kwargs: Any)[source]#

Bases: langchain.llms.base.LLM

Wrapper around the BigDL-LLM Transformer-INT4 model

Example

from bigdl.llm.langchain.llms import TransformersLLM
llm = TransformersLLM.from_model_id(model_id="THUDM/chatglm-6b")
classmethod from_model_id(model_id: str, model_kwargs: Optional[dict] = None, **kwargs: Any) langchain.llms.base.LLM[source]#

Construct a TransformersLLM object from model_id.

Parameters
  • model_id – The Hugging Face repo id to download, or the path to a local Hugging Face checkpoint folder.

  • model_kwargs – Keyword arguments that will be passed to the model and tokenizer.

  • kwargs – Extra arguments that will be passed to the model and tokenizer.

Returns

An object of TransformersLLM.

classmethod from_model_id_low_bit(model_id: str, model_kwargs: Optional[dict] = None, **kwargs: Any) langchain.llms.base.LLM[source]#

Construct a low-bit TransformersLLM object from model_id.

Parameters
  • model_id – Path to the BigDL-LLM transformers low-bit model checkpoint folder.

  • model_kwargs – Keyword arguments that will be passed to the model and tokenizer.

  • kwargs – Extra arguments that will be passed to the model and tokenizer.

Returns

An object of TransformersLLM.

Native Model#

For the llama/chatglm/bloom/gptneox/starcoder model families, you can also use the following LLM wrappers, which run the native (cpp) implementation for maximum performance.

class bigdl.llm.langchain.llms.bigdlllm.LlamaLLM(*args: Any, **kwargs: Any)[source]#

Bases: bigdl.llm.langchain.llms.bigdlllm._BaseCausalLM

validate_environment(values: Dict) Dict#

Validate that bigdl-llm is installed and that the model family is supported.

stream(prompt: str, stop: Optional[List[str]] = None, run_manager: Optional[langchain.callbacks.manager.CallbackManagerForLLMRun] = None) Generator[Dict, None, None]#

Yields result objects as they are generated in real time.

BETA: this is a beta feature while we figure out the right abstraction. Once that happens, this interface could change.

It also calls the callback manager’s on_llm_new_token event with similar parameters to the OpenAI LLM class method of the same name.

Parameters
  • prompt – The prompt to pass into the model.

  • stop – Optional list of stop words to use when generating.

Returns

A generator representing the stream of tokens being generated.

Yields

Dictionary-like objects, each containing a string token and metadata. See llama-cpp-python docs and below for more.

Example

from bigdl.llm.langchain.llms import LlamaLLM

llm = LlamaLLM(
    model_path="/path/to/local/model.bin",
    temperature=0.5,
)
# Print each generated token as soon as it arrives.
for chunk in llm.stream("Ask 'Hi, how are you?' like a pirate:'",
        stop=["'", "\n"]):
    result = chunk["choices"][0]
    print(result["text"], end="", flush=True)
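
When the full completion is needed rather than incremental printing, the streamed chunks can be joined afterwards. This is a minimal sketch that assumes every chunk has the shape used in the example for this method, with the token at chunk["choices"][0]["text"]. collect_text and fake_stream are hypothetical helpers; fake_stream only imitates that shape so the snippet runs without a model.

```python
from typing import Dict, Iterable, Iterator, List


def collect_text(chunks: Iterable[Dict]) -> str:
    """Concatenate the token text of every streamed chunk."""
    return "".join(chunk["choices"][0]["text"] for chunk in chunks)


def fake_stream(tokens: List[str]) -> Iterator[Dict]:
    """Stand-in for llm.stream(...): yields chunks in the assumed shape."""
    for token in tokens:
        yield {"choices": [{"text": token}]}


print(collect_text(fake_stream(["Ahoy", ",", " matey"])))
```

With a real model, the same helper would be called as collect_text(llm.stream(prompt, stop=[...])).
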
get_num_tokens(text: str) int#

Get the number of tokens present in the text.

Useful for checking if an input will fit in a model’s context window.

Parameters

text – The string input to tokenize.

Returns

The number of tokens in the text.

Embeddings Wrapper of LangChain#

Hugging Face transformers AutoModel#

Wrapper around BigDL-LLM embedding models.

class bigdl.llm.langchain.embeddings.transformersembeddings.TransformersEmbeddings(*args: Any, **kwargs: Any)[source]#

Bases: pydantic.BaseModel, langchain.embeddings.base.Embeddings

Wrapper around bigdl-llm transformers embedding models.

To use this class, you need the transformers Python package installed.

Example

from bigdl.llm.langchain.embeddings import TransformersEmbeddings
# model_id: a Hugging Face repo id or the path to a local checkpoint folder
embeddings = TransformersEmbeddings.from_model_id(model_id)
classmethod from_model_id(model_id: str, model_kwargs: Optional[dict] = None, **kwargs: Any)[source]#

Construct a TransformersEmbeddings object from model_id.

Parameters
  • model_id – The Hugging Face repo id to download, or the path to a local Hugging Face checkpoint folder.

  • model_kwargs – Keyword arguments that will be passed to the model and tokenizer.

  • kwargs – Extra arguments that will be passed to the model and tokenizer.

Returns

An object of TransformersEmbeddings.

embed(text: str, **kwargs)[source]#

Compute the embedding of a single text using a Hugging Face transformers model.

Parameters

text – The text to embed.

Returns

The embedding for the text.

embed_documents(texts: List[str]) List[List[float]][source]#

Compute document embeddings using a Hugging Face transformers model.

Parameters

texts – The list of texts to embed.

Returns

List of embeddings, one for each text.

embed_query(text: str) List[float][source]#

Compute query embeddings using a bigdl-llm transformer model.

Parameters

text – The text to embed.

Returns

Embeddings for the text.

Native Model#

For the llama/bloom/gptneox/starcoder model families, you can also use the following native-model wrappers.

class bigdl.llm.langchain.embeddings.bigdlllm.LlamaEmbeddings(*args: Any, **kwargs: Any)[source]#

Bases: bigdl.llm.langchain.embeddings.bigdlllm._BaseEmbeddings

validate_environment(values: Dict) Dict#

Validate that bigdl-llm library is installed.

embed_documents(texts: List[str]) List[List[float]]#

Embed a list of documents using the optimized int4 model.

Parameters

texts – The list of texts to embed.

Returns

List of embeddings, one for each text.

embed_query(text: str) List[float]#

Embed a query using the optimized int4 model.

Parameters

text – The text to embed.

Returns

Embeddings for the text.