Scaleway hugging face reasoning provider

I’m excited to share that Scaleway is a supported reasoning provider for Hug Face Hub! Scaleway joins a growing ecosystem and directly enhances the breadth and capabilities of serverless inference on the hub’s model page. Inference providers are seamlessly integrated into the client SDK (both JS and Python), making it easy to use different models using preferred providers.

With this launch, popular open weight models like the GPT-Oss, QWEN3, Deepseek R1, and Gemma 3 are easier to access than ever. You can browse your Scaleway organization in the hub at https://huggingface.co/scaleway and try out supported models at https://huggingface.co/models?inference_provider =scaleway&sort=treending.

The Scaleway Generative API is a fully managed serverless service that provides access to the frontier AI models of major research labs through simple API calls. The service offers competitive per token pricing starting at 0.20 euros per million tokens.

The service runs on a secure infrastructure in the European data center (Paris, France) and ensures data sovereignty and low latency for European users. The platform supports advanced features such as structured output, function calls and multimodal features for both text and image processing.

Built for production, Scaleway’s inference infrastructure provides sub-200ms response time for the first token, making it ideal for interactive applications and agent workflows. This service supports both text generation and embedded models. Find out more about Scaleway’s platform and infrastructure at https://www.scaleway.com/en/generative-apis/.

Learn more about using Scaleway as an inference provider on our dedicated documentation page.

See the list of supported models here.

How it works

In the website UI

In User Account Settings, you set your own API key for the provider you signed up for. If no custom key is configured, the request is routed through HF. Order a provider if you like. This applies to model page widgets and code snippets.

As mentioned before, when calling an inference provider there are two modes: a custom key (the call goes directly to the inference provider, using the corresponding inference provider’s own API key) (in that case no tokens are required from the provider.

The model page introduces third-party inference providers (compatible with current models sorted by user preferences)

From the client SDK

I’m using Huggingface_hub from Python

The following example shows how to use Swiss AI’s Apertus-70B using Scaleway as the inference provider: Automatic routing through a hugging face can be used with a hugging face token or your own Scaleway API key if you have one.

Note: This requires using a recent version of Huggingface_hub (>=0.34.6).

Import OS
from huggingface_hub Import Inference client=Inference client(provider=“Scaleway”,api_key = os.environ(“HF_TOKEN”) ) message = ({
“role”: “user”,
“content”: “Writing poetry in Shakespeare’s style”
}) complete = client.chat.completions.create(model =“Openai/gpt-oss-120b”message = message,)

printing(complete.choices)0). message)

From JS using @huggingface/Incerence

Import { inference } from “@Huggingface/Inference”;

const Client= new inference(process.Env.hf_token);

const chatcompletion = wait client.ChatCompletion({
Model: “Openai/gpt-oss-120b”,
message:({
role: “user”,
content: “Writing poetry in Shakespeare’s style”,},),
Provider: “Scaleway”,});

console.log(ChatCompletion.Choices(0).message);

Request

Here’s how billing works:

For direct requests, i.e. when using keys from inference providers, the corresponding provider will be billed. For example, if you are using a Scaleway API key, your Scaleway account will be billed.

For routed requests, i.e. when authenticating through a facehub that hugs, you only pay the standard provider API rate. There is no additional markup from us. It simply passes the provider’s costs directly. (In the future, we may establish a revenue sharing agreement with our provider partners.)

Important Memopia users get $2 worth of inference credits each month. You can use them between providers. 🔥

Subscribe to our Hugging Face Pro plan for access to inference credits, Zerogpu, Spaces Dev Mode, 20x high limits and more.

We also infer small allocations for sign-in free users for free, but upgrade to Pro if possible!

Feedback and next steps

We want to get your feedback! Share your thoughts and comments here: https://huggingface.co/spaces/huggingface/huggingdiscussions/discussions/49

versatileai

See Full Bio

What's Hot

The future of physical AI revealed in the LG and NVIDIA meeting

How to build scalable web apps using OpenAI privacy filters

Per-token AI fees coming to GitHub Copilot

The future of physical AI revealed in the LG and NVIDIA meeting

How to build scalable web apps using OpenAI privacy filters

Per-token AI fees coming to GitHub Copilot

DeepInfra on Hug Face Inference Provider 🔥

Soulgen revolutionizes the creation of NSFW content

Per-token AI fees coming to GitHub Copilot

Most Popular

DeepInfra on Hug Face Inference Provider 🔥

Soulgen revolutionizes the creation of NSFW content

Per-token AI fees coming to GitHub Copilot

Don't Miss

The future of physical AI revealed in the LG and NVIDIA meeting

How to build scalable web apps using OpenAI privacy filters

Per-token AI fees coming to GitHub Copilot

Subscribe to Updates

What's Hot

Scaleway hugging face reasoning provider

How it works

In the website UI

From the client SDK

I’m using Huggingface_hub from Python

From JS using @huggingface/Incerence

Request

Feedback and next steps

Related Posts