We are excited to announce that three more excellent serverless inference providers are joining the Hugging Face Hub: Hyperbolic, Nebius AI Studio, and Novita. These providers join the growing ecosystem, directly enhancing the breadth and capabilities of serverless inference on the Hub's model pages. They are also seamlessly integrated into our client SDKs (both JS and Python), making it easy to use a wide variety of models with your preferred provider.
These partners join the ranks of our existing providers, including Together AI, SambaNova, Replicate, Fal, and Fireworks.ai.
The new partners enable access to trending new models such as DeepSeek-R1 and FLUX.1. Find all supported models below.
We can't wait to see what you'll build with these new providers!
How it works
In the website UI
In your user account settings, you can set your own API keys for the providers you have signed up with. If no custom key is set, your requests will be routed through HF. You can also order providers by preference. This applies to both the model page widgets and the code snippets.
As mentioned above, there are two modes for calling the inference APIs: with a custom key, the call goes directly to the inference provider, using your own API key for that provider; when routed by HF, no provider token is needed, and the charges are applied directly to your HF account rather than to a provider account.
The model page showcases third-party inference providers (those compatible with the current model, sorted by user preference).
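The key-routing rule described above can be sketched in a few lines. This is a minimal illustration only; `resolve_route` and its return structure are hypothetical and not part of any HF library:

```python
def resolve_route(provider: str, custom_keys: dict) -> dict:
    """Decide how an inference call is routed, per the rules above:
    - a custom provider key -> call goes directly to that provider,
      billed to your provider account;
    - otherwise -> the request is routed through HF and billed to
      your HF account.
    """
    if provider in custom_keys:
        return {
            "endpoint": provider,
            "billed_to": "provider_account",
            "auth": custom_keys[provider],
        }
    return {
        "endpoint": "huggingface",
        "billed_to": "hf_account",
        "auth": "hf_token",
    }

# With a custom Hyperbolic key configured, the call goes direct:
print(resolve_route("hyperbolic", {"hyperbolic": "hk_xxx"})["billed_to"])
# Without one, it is routed through HF:
print(resolve_route("novita", {})["billed_to"])
```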
From the client SDK
From Python, using huggingface_hub
The following example shows how to use DeepSeek-R1 with Hyperbolic as the inference provider. You can use a Hugging Face token for automatic routing through Hugging Face, or your own Hyperbolic API key if you have one.
Install huggingface_hub from source (see instructions). Official support will soon be released in version v0.29.0.
```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hyperbolic",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

print(completion.choices[0].message)
```
Here is how to generate an image from a text prompt using FLUX.1-dev running on Nebius AI Studio:
```python
from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="nebius",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
)

image = client.text_to_image(
    "Bob Marley in the style of a painting by Johannes Vermeer",
    model="black-forest-labs/FLUX.1-schnell"
)
```
To switch to another provider, simply change the provider name; everything else stays the same.
```diff
from huggingface_hub import InferenceClient

client = InferenceClient(
-   provider="nebius",
+   provider="hyperbolic",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
)
```
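The diff above boils down to a single constructor argument. One way to make that explicit in your own code is a small helper; `client_kwargs` below is a hypothetical convenience function, not part of huggingface_hub:

```python
def client_kwargs(provider: str, api_key: str) -> dict:
    """Build the keyword arguments for InferenceClient; only the
    provider name changes when you switch backends."""
    return {"provider": provider, "api_key": api_key}

nebius = client_kwargs("nebius", "xxxx")
hyperbolic = client_kwargs("hyperbolic", "xxxx")

# Everything except the provider field is identical:
assert {k: v for k, v in nebius.items() if k != "provider"} == \
       {k: v for k, v in hyperbolic.items() if k != "provider"}
```

You would then construct the client with `InferenceClient(**client_kwargs(...))`, keeping the provider choice in one place.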
From JS using @huggingface/inference
```js
import { HfInference } from "@huggingface/inference";

const client = new HfInference("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");

const chatCompletion = await client.chatCompletion({
    model: "deepseek-ai/DeepSeek-R1",
    messages: [
        {
            role: "user",
            content: "What is the capital of France?"
        }
    ],
    provider: "novita",
    max_tokens: 500
});

console.log(chatCompletion.choices[0].message);
```
Billing
For direct requests, i.e. when you use a key from an inference provider, you are billed by the corresponding provider. For instance, if you use a Nebius AI Studio key, your Nebius AI Studio account is billed.
For routed requests, i.e. when you authenticate via the Hub, you only pay the standard provider API rates. There is no additional markup; we pass the provider costs through directly. (In the future, we may establish revenue-sharing agreements with our provider partners.)
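The pass-through billing rule above can be expressed numerically. This is illustrative only; the per-token rate is made up and does not reflect any real provider price:

```python
def routed_cost(tokens: int, provider_rate_per_token: float) -> float:
    """Routed requests pay exactly the provider's standard rate:
    HF adds no markup on top."""
    markup = 0.0  # no additional markup on routed requests
    return tokens * provider_rate_per_token * (1 + markup)

# 500 tokens at a hypothetical $0.000002/token costs the same
# whether billed directly by the provider or routed through HF:
direct = 500 * 0.000002
assert routed_cost(500, 0.000002) == direct
```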
Important Note: PRO users get $2 worth of inference credits every month. You can use them across providers. 🔥
Subscribe to the Hugging Face PRO plan to get access to inference credits, ZeroGPU, Spaces Dev Mode, 20x higher limits, and more.
We also provide free inference with small quotas for signed-in free users, but please upgrade to PRO if you can!
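The monthly credit mechanics described above can be sketched as follows. This is a simplified model for illustration; the actual accounting is internal to HF:

```python
def spend_credits(balance: float, charges: dict) -> float:
    """Deduct routed-inference charges (in USD) from the monthly
    credit balance; credits are shared across all providers."""
    for provider, cost in charges.items():
        balance -= cost
    return max(balance, 0.0)

# A PRO user starts each month with $2.00 of credits,
# usable across any mix of providers:
remaining = spend_credits(2.00, {"nebius": 0.40, "hyperbolic": 0.25})
print(round(remaining, 2))  # 1.35
```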
Feedback and next steps
We would love to get your feedback! Share your thoughts in this Hub discussion: https://huggingface.co/spaces/huggingface/huggingdiscussions/discussions/49