Cohere on Hugging Face Inference Providers

By versatileai · April 18, 2025 · 6 Mins Read

We’re excited to share that Cohere is now a supported inference provider on the Hugging Face Hub! This marks the first time a model creator shares and serves their models directly on the Hub.

Cohere is committed to building and serving models purpose-built for enterprise use cases. From cutting-edge generative AI to powerful embedding and reranking models, their comprehensive suite of AI solutions is designed to tackle real business challenges. In addition, Cohere Labs, Cohere’s in-house research lab, supports fundamental research and seeks to change the spaces where research happens.

You can now run serverless inference on a selection of Cohere and Cohere Labs models via Inference Providers.

Fire up your projects with Cohere and Cohere Labs today!

Cohere models

Cohere and Cohere Labs bring a range of models to Inference Providers that excel at specific business applications. Let’s explore some of them in detail.

CohereLabs/c4ai-command-a-03-2025

Optimized for demanding enterprises that require fast, secure AI. Its 256k context length (twice that of most leading models) can handle much longer enterprise documents. Other key features include Cohere’s advanced retrieval-augmented generation (RAG) with verifiable citations, agentic tool use, enterprise-grade security, and strong multilingual performance (supporting 23 languages).

CohereLabs/aya-expanse-32b

Focused on state-of-the-art multilingual support, including lower-resource languages. Supports Arabic, Chinese (simplified and traditional), Czech, Dutch, English, French, German, Greek, Hebrew, Hindi, Indonesian, Italian, Japanese, Korean, Persian, Polish, Portuguese, Romanian, Russian, Spanish, Turkish, Ukrainian, and Vietnamese, with a 128k context length.

CohereLabs/c4ai-command-r7b-12-2024

Perfect for low-cost or low-latency use cases, bringing state-of-the-art performance in its class of open-weight models across real-world tasks. This model offers a 128k context length and a powerful combination of multilingual support, citation-verified retrieval-augmented generation (RAG), reasoning, tool use, and agentic behavior. It is a multilingual model trained on 23 languages.

CohereLabs/aya-vision-32b

A 32-billion-parameter model with advanced capabilities optimized for a variety of vision-language use cases, including OCR, captioning, visual reasoning, summarization, question answering, code, and more. It expands multimodal capabilities to 23 languages spoken by over half the world’s population.

How it works

You can use Cohere models directly on the Hub, either via the website UI or through the client SDKs.

You can find all the examples explained in this section on the Cohere documentation page.

In the website UI

You can find Cohere models by filtering by inference provider on the model hub.

Cohere Provider UI

From the model card, you can select the inference provider and run inference directly in the UI.

GIF: using the Cohere inference provider in the UI

From the client SDK

Let’s walk through using a Cohere model with the client SDKs. We’ve also created a Colab notebook with these snippets, in case you want to try them right away.

From Python, using huggingface_hub

The following example shows how to use Command R7B with Cohere as the inference provider. You can use a Hugging Face token for automatic routing through Hugging Face.

Install huggingface_hub v0.30.0 or later:

pip install -U "huggingface_hub>=0.30.0"

Use the huggingface_hub Python library to call the Cohere endpoints, with the provider parameter set to cohere.

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="cohere",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

messages = [
    {
        "role": "user",
        "content": "How do you make a very spicy mayonnaise?"
    }
]

completion = client.chat.completions.create(
    model="CohereLabs/c4ai-command-r7b-12-2024",
    messages=messages,
    temperature=0.7,
    max_tokens=512,
)

print(completion.choices[0].message)
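Multi-turn chat uses the same OpenAI-compatible message list: you append the assistant’s reply before adding the next user turn. A minimal sketch of that bookkeeping (no API call is made; the assistant text here is made up for illustration):

```python
# Maintain a multi-turn history in the OpenAI-compatible message format.
messages = [
    {"role": "user", "content": "How do you make a very spicy mayonnaise?"}
]

# Suppose the provider returned this assistant message:
messages.append({"role": "assistant", "content": "Blend in habanero or ghost pepper."})

# The next user turn continues the same list, which is passed back on the next call.
messages.append({"role": "user", "content": "Can I make it less oily?"})

print([m["role"] for m in messages])  # ['user', 'assistant', 'user']
```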

Aya Vision, Cohere Labs’ multilingual and multimodal model, is also supported. You can include base64-encoded images like this:

import base64
from huggingface_hub import InferenceClient

image_path = "img.jpg"
with open(image_path, "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")
image_url = f"data:image/jpeg;base64,{base64_image}"

client = InferenceClient(
    provider="cohere",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_url",
                "image_url": {"url": image_url},
            },
        ],
    }
]

completion = client.chat.completions.create(
    model="CohereLabs/aya-vision-32b",
    messages=messages,
    temperature=0.7,
    max_tokens=512,
)

print(completion.choices[0].message)
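If you want to check the data-URL construction without an image file on disk, you can build it from in-memory bytes. A minimal sketch (the bytes are a fake JPEG header, for illustration only):

```python
import base64

# Build the same kind of data URL as above, from in-memory bytes
# instead of reading img.jpg from disk.
fake_image_bytes = b"\xff\xd8\xff\xe0\x00\x10JFIF"  # fake JPEG header
b64 = base64.b64encode(fake_image_bytes).decode("utf-8")
image_url = f"data:image/jpeg;base64,{b64}"

print(image_url.startswith("data:image/jpeg;base64,"))  # True
```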

From JS, using @huggingface/inference

import { HfInference } from "@huggingface/inference";

const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
    model: "CohereLabs/c4ai-command-a-03-2025",
    messages: [
        {
            role: "user",
            content: "How do you make a very spicy mayonnaise?"
        }
    ],
    provider: "cohere",
    max_tokens: 512
});

console.log(chatCompletion.choices[0].message);

From the OpenAI client

Here’s how to call Command R7B via Cohere using the OpenAI client library:

from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/cohere/compatibility/v1",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

messages = [
    {
        "role": "user",
        "content": "How do you make a very spicy mayonnaise?"
    }
]

completion = client.chat.completions.create(
    model="command-a-03-2025",
    messages=messages,
    temperature=0.7,
)

print(completion.choices[0].message)

Using tools in Cohere models

Cohere’s models bring state-of-the-art agentic tool use to Inference Providers, so let’s take a closer look. Both the Hugging Face Hub client and the OpenAI client support tools via Inference Providers, so you can extend the examples above.

First, you need to define the tools available to the model. Below, we define get_flight_info, which calls an up-to-date flight information API for two given locations. The tool definition will be represented in the model’s chat template; you can also explore it on the model card (it’s open source!).

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_flight_info",
            "description": "Get flight information between two cities or airports",
            "parameters": {
                "type": "object",
                "properties": {
                    "loc_origin": {
                        "type": "string",
                        "description": "The departure airport, e.g. MIA",
                    },
                    "loc_destination": {
                        "type": "string",
                        "description": "The destination airport, e.g. NYC",
                    },
                },
                "required": ["loc_origin", "loc_destination"],
            },
        },
    }
]
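Before wiring the tool into a client call, you can sanity-check an argument payload against the schema’s required fields using only the standard library. A minimal sketch (not a full JSON Schema validator; check_args is a helper introduced here for illustration):

```python
import json

# Required parameters from the get_flight_info schema.
required = ["loc_origin", "loc_destination"]

def check_args(arguments_json: str) -> bool:
    """Return True if every required parameter is present in the payload."""
    args = json.loads(arguments_json)
    return all(key in args for key in required)

print(check_args('{"loc_origin": "MIA", "loc_destination": "SEA"}'))  # True
print(check_args('{"loc_origin": "MIA"}'))  # False
```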

Then, you pass messages to the model, and the model uses the tools when relevant. The following example spells out the assistant’s tool call in tool_calls, for clarity.

messages = [
    {"role": "developer", "content": "Today is April 30th."},
    {
        "role": "user",
        "content": "When is the next flight from Miami to Seattle?",
    },
    {
        "role": "assistant",
        "tool_calls": [
            {
                "function": {
                    "arguments": '{"loc_destination": "Seattle", "loc_origin": "Miami"}',
                    "name": "get_flight_info",
                },
                "id": "get_flight_info0",
                "type": "function",
            }
        ],
    },
    {
        "role": "tool",
        "name": "get_flight_info",
        "tool_call_id": "get_flight_info0",
        "content": "From Miami to Seattle, May 1st, 10 AM.",
    },
]

Finally, the tools and the messages are passed to the create method.

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="cohere",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)

completion = client.chat.completions.create(
    model="CohereLabs/c4ai-command-r7b-12-2024",
    messages=messages,
    tools=tools,
    temperature=0.7,
    max_tokens=512,
)

print(completion.choices[0].message)
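When the model responds with a tool call, your code parses the JSON arguments and dispatches to a local function. A minimal sketch with a hypothetical local get_flight_info implementation and a simulated tool call (no API involved):

```python
import json

# Hypothetical local implementation of the tool defined earlier.
def get_flight_info(loc_origin: str, loc_destination: str) -> str:
    # A real app would call a flight-information API here.
    return f"{loc_origin} to {loc_destination} on May 1st, 10 AM."

# Simulated tool call, shaped like the assistant message's tool_calls entry.
tool_call = {
    "id": "get_flight_info0",
    "function": {
        "name": "get_flight_info",
        "arguments": '{"loc_destination": "Seattle", "loc_origin": "Miami"}',
    },
}

# Parse the JSON arguments and dispatch by function name.
dispatch = {"get_flight_info": get_flight_info}
args = json.loads(tool_call["function"]["arguments"])
result = dispatch[tool_call["function"]["name"]](**args)
print(result)  # Miami to Seattle on May 1st, 10 AM.
```

The result is then sent back as a message with role "tool" and the matching tool_call_id, as shown in the messages above.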

Billing

For direct requests, i.e. when you use a Cohere API key, you are billed directly on your Cohere account.

For routed requests, i.e. when you authenticate via the Hub, you only pay the standard Cohere API rates. There’s no additional markup; we pass the provider costs through directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)

Important note: Hugging Face users get $2 worth of inference credits every month. You can use them across providers. 🔥

Subscribe to the Hugging Face PRO plan to get access to inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.
