We are excited to announce the launch of langchain_huggingface, a partner package in LangChain jointly maintained by Hugging Face and LangChain. This new Python package is designed to bring the power of the latest Hugging Face developments into LangChain and keep it up to date.
All Hugging Face-related classes in LangChain were coded by the community, and while we thrived on this, over time some of them became deprecated because of the lack of an insider's perspective.
By becoming a partner package, we aim to reduce the time it takes to bring new features from the Hugging Face ecosystem to LangChain users.
langchain-huggingface integrates seamlessly with LangChain, providing an efficient and effective way to utilize Hugging Face models within the LangChain ecosystem. This partnership is not just about sharing technology, but also a joint commitment to maintain and continually improve this integration.
Get started
Getting started with langchain-huggingface is easy. Here's how to install and use the package:
pip install langchain-huggingface
Now that the package is installed, let’s take a tour of the contents!
LLMs
HuggingFacePipeline
Among transformers, the pipeline is the most versatile tool in the Hugging Face toolbox. Since LangChain is mainly designed to address RAG and agent use cases, the scope of the pipeline here is reduced to the following text-centric tasks: "text-generation", "text2text-generation", "summarization", and "translation".
The model can be loaded directly using the from_model_id method.
from langchain_huggingface import HuggingFacePipeline

llm = HuggingFacePipeline.from_model_id(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    task="text-generation",
    pipeline_kwargs={
        "max_new_tokens": 100,
        "top_k": 50,
        "temperature": 0.1,
    },
)
llm.invoke("Hugging Face is")
Alternatively, you can define the pipeline yourself before passing it to the class.
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from langchain_huggingface import HuggingFacePipeline

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_4bit=True,
)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100,
    top_k=50,
    temperature=0.1,
)
llm = HuggingFacePipeline(pipeline=pipe)
llm.invoke("Hugging Face is")
When using this class, the model is loaded in cache and runs on your computer's hardware; you may therefore be limited by the resources available on your machine.
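If you have a GPU available, you can ask the pipeline to run on it. Here is a minimal sketch, assuming your installed version of HuggingFacePipeline.from_model_id exposes a device parameter and that device 0 is a CUDA GPU; check the signature of from_model_id in your version before relying on it.

from langchain_huggingface import HuggingFacePipeline

# device=0 places the pipeline on the first GPU; leaving it unset (or using -1,
# the default behaviour of transformers pipelines) keeps it on CPU. The parameter
# name is an assumption to verify against your installed version.
gpu_llm = HuggingFacePipeline.from_model_id(
    model_id="microsoft/Phi-3-mini-4k-instruct",
    task="text-generation",
    device=0,
    pipeline_kwargs={"max_new_tokens": 100},
)
gpu_llm.invoke("Hugging Face is")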
HuggingFaceEndpoint
There are two ways to use this class. You can specify the model with the repo_id parameter, in which case requests go through the serverless API; this is especially beneficial for people with a Pro account or an Enterprise Hub, although regular users can already access a fair amount of requests by connecting with their HF token in the environment where they are running their code. Alternatively, you can point the class at your own deployed endpoint with the endpoint_url parameter.
from langchain_huggingface import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3-8B-Instruct",
    task="text-generation",
    max_new_tokens=100,
    do_sample=False,
)
llm.invoke("Hugging Face is")

llm = HuggingFaceEndpoint(
    endpoint_url="",
    task="text-generation",
    max_new_tokens=1024,
    do_sample=False,
)
llm.invoke("Hugging Face is")
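As mentioned above, the serverless API works best when your HF token is available in the environment. A minimal sketch of two common ways to make it available, assuming HUGGINGFACEHUB_API_TOKEN is the environment variable read by the package and "hf_..." stands in for your own token:

import os
from huggingface_hub import login

# Option 1: log in programmatically (or run `huggingface-cli login` in a terminal).
login()  # prompts for your token, or pass token="hf_..." explicitly

# Option 2: export the token as an environment variable. The variable name below
# is the one conventionally read by LangChain's Hugging Face integrations; treat
# it as an assumption and check the documentation of your installed version.
os.environ["HUGGINGFACEHUB_API_TOKEN"] = "hf_..."  # placeholder token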
Under the hood, this class uses the InferenceClient to serve a wide range of use cases, from the serverless API to deployed TGI instances.
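For instance, if you already have a TGI container running locally, you can point the same class at it. A minimal sketch, assuming a TGI instance is serving on http://localhost:8080 (the URL is a placeholder for your own deployment):

from langchain_huggingface import HuggingFaceEndpoint

# The endpoint_url below is a hypothetical local TGI deployment; replace it with
# the address of your own instance (or of an Inference Endpoint on the Hub).
llm = HuggingFaceEndpoint(
    endpoint_url="http://localhost:8080",
    task="text-generation",
    max_new_tokens=100,
)
llm.invoke("Hugging Face is")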
ChatHuggingFace
Every model has its own special tokens with which it works best. If those tokens are not added to the prompt, the model will perform significantly worse.
When going from a list of messages to a completion prompt, most LLM tokenizers expose an attribute called chat_template.
For more information about chat_templates for various models, visit this space I’ve created!
This class is a wrapper around the other LLMs. It takes a list of messages as input and then creates the correct completion prompt using the tokenizer.apply_chat_template method.
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    endpoint_url="",
    task="text-generation",
    max_new_tokens=1024,
    do_sample=False,
)
llm_engine_hf = ChatHuggingFace(llm=llm)
llm_engine_hf.invoke("Hugging Face is")
The code above is equivalent to:
# with a Mistral-style instruct model
llm.invoke("[INST] Hugging Face is [/INST]")

# with a Llama-3-style instruct model
llm.invoke("""<|begin_of_text|><|start_header_id|>user<|end_header_id|>Hugging Face is<|eot_id|><|start_header_id|>assistant<|end_header_id|>""")
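To see what ChatHuggingFace does for you, here is a minimal sketch that builds the same kind of completion prompt by hand with tokenizer.apply_chat_template; the choice of Meta-Llama-3-8B-Instruct is only for illustration, and the model is gated on the Hub.

from transformers import AutoTokenizer

# Requires having accepted the model's license on the Hub.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hugging Face is"}]

# Render the chat template without tokenizing, and append the assistant header
# so the model knows it should start generating the answer.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)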
Embeddings
Hugging Face is filled with very powerful embedding models that you can directly leverage in your pipeline.
First, select the model. One great resource for choosing an embedding model is the MTEB leaderboard.
HuggingFaceEmbeddings
This class uses sentence-transformers embeddings. The embeddings are computed locally, hence using your computer's resources.
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

model_name = "mixedbread-ai/mxbai-embed-large-v1"
hf_embeddings = HuggingFaceEmbeddings(
    model_name=model_name,
)
texts = ["Hello, world!", "How are you?"]
hf_embeddings.embed_documents(texts)
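Once the documents are embedded, you typically also embed a query and compare it against them. A minimal sketch reusing hf_embeddings and texts from the snippet above, assuming cosine similarity as the comparison metric (embed_query and embed_documents are the standard LangChain embeddings methods; numpy is only used for the similarity computation):

import numpy as np

# Embed the documents and a query with the same model.
doc_vectors = np.array(hf_embeddings.embed_documents(texts))
query_vector = np.array(hf_embeddings.embed_query("Hi there!"))

# Cosine similarity between the query and each document.
scores = doc_vectors @ query_vector / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(query_vector)
)
print(scores)  # higher score = closer match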
HuggingFaceEndpointEmbeddings
HuggingFaceEndpointEmbeddings is very similar to HuggingFaceEndpoint in what it does for LLMs. It can be used with models on the Hub and with TEI instances, whether they are deployed locally or online.
from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings

hf_embeddings = HuggingFaceEndpointEmbeddings(
    model="mixedbread-ai/mxbai-embed-large-v1",
    task="feature-extraction",
    huggingfacehub_api_token="",
)
texts = ["Hello, world!", "How are you?"]
hf_embeddings.embed_documents(texts)
Conclusion
We are committed to making langchain-huggingface better by the day. We actively monitor feedback and issues and work to address them as quickly as possible. We are also adding new features and functionality, expanding the package to support an even wider range of the community's use cases. We strongly encourage you to try this package and share your opinion, as it will pave the way for the package's future.