Close Menu
Versa AI hub
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
  • Resources

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

What's Hot

Techex Europe 2025: Practical learning for AI leaders

September 20, 2025

Scaleway hugging face reasoning provider

September 20, 2025

Discover new solutions to the first century’s problems in fluid dynamics

September 19, 2025
Facebook X (Twitter) Instagram
Versa AI hubVersa AI hub
Saturday, September 20
Facebook X (Twitter) Instagram
Login
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
  • Resources
Versa AI hub
Home»Tools»Brings serverless GPU reasoning to hug face users
Tools

Brings serverless GPU reasoning to hug face users

versatileaiBy versatileaiJune 22, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
#image_title
Share
Facebook Twitter LinkedIn Pinterest Email

Update (November 2024): The integration is no longer available. Switch to the inference API, inference endpoint, or other deployment options for hugging faces, depending on your AI model’s needs.

Today, we are excited to announce the launch of Deploy for CloudFlare Workers AI, a new integration for Hugging Face Hub. Deployed to CloudFlare Workers AI easily uses the open model as a serverless API with cutting-edge GPUs deployed in CloudFlare Edge data centers. Starting today, we’ve integrated some of the most popular open models that will hug your face to CloudFlare Worker AI with production solutions, including text generation inference.

Deploying CloudFlare Worker AI allows developers to build robust, generated AI applications without managing GPU infrastructure and servers, with very low operating costs.

Generated AI for developers

This new experience expands the strategic partnership that was announced last year to simplify access and deployment of open-generated AI models. One of the main issues faced by developers and organizations is the scarcity of GPU availability and the fixed cost of deploying servers to start buildings. CloudFlare Worker Deployment AI offers easy, low-cost solutions to these challenges, providing serverless access to popular embrace face models.

Let’s take a look at a concrete example. Imagine developing a RAG application that gets ~1000 requests per day and developing a 1K token input and a 100 token output using Meta Llama 2 7b. The production cost of LLM inference is approximately $1 per day.

“We look forward to achieving this integration very quickly. By putting the power of CloudFlare’s global network of serverless GPUs into the hands of developers, we are opening the door to many exciting innovations by communities around the world.”

How it works

It’s very easy to use embracing face models with CloudFlare Worker AI. Below are step-by-step instructions on how to use the Hermes 2 Pro with the latest model from Nous Research, the Mistral 7b.

You can find all the models available in this CloudFlare collection.

Note: You will need to access your CloudFlare account and API tokens.

You can find CloudFlare deployment options on all available model pages, including models such as Llama, Gemma, Mistral, and more.

Model Card

Open the Deployment menu and select CloudFlare Workers AI. This opens an interface on how to use this model and how to send requests.

Note: If the model you are using does not have the “CloudFlare Workers AI” option, it is currently not supported. We are working with CloudFlare to increase the availability of our models. Please contact us using your request at api-enterprise@huggingface.co.

Inference snippet

Currently, integration is available through two options. It can be used directly by workers using the Worker AI REST API or using the CloudFlare AI SDK. Select the option you want and copy the code to your environment. When using the REST API, you must ensure that the Account_ID and API_TOKEN variables are defined.

that’s it! You can now begin sending requests to hug face models hosted by CloudFlare Worker AI. Make sure to use the correct prompts and templates that your model expects.

I’ve just started

We are excited to work with CloudFlare to make AI more accessible for developers. Work with the CloudFlare team to make more models and experiences available!

author avatar
versatileai
See Full Bio
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleSix Secrets of a Superworker Company – Josh Bershin
Next Article Microbets on trend AI for cybersecurity companies offset US tariff mishaps
versatileai

Related Posts

Tools

Techex Europe 2025: Practical learning for AI leaders

September 20, 2025
Tools

Scaleway hugging face reasoning provider

September 20, 2025
Tools

Discover new solutions to the first century’s problems in fluid dynamics

September 19, 2025
Add A Comment

Comments are closed.

Top Posts

Direct integration with embracing face

August 30, 20251 Views

California gives breathing chambers to price discrimination for AI

August 30, 20251 Views

Pixverse V5 AI ART Generator sees record demand in its first 96 hours: Market Opportunities and User Trends | AI News Details

August 29, 20251 Views
Stay In Touch
  • YouTube
  • TikTok
  • Twitter
  • Instagram
  • Threads
Latest Reviews

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

Most Popular

Direct integration with embracing face

August 30, 20251 Views

California gives breathing chambers to price discrimination for AI

August 30, 20251 Views

Pixverse V5 AI ART Generator sees record demand in its first 96 hours: Market Opportunities and User Trends | AI News Details

August 29, 20251 Views
Don't Miss

Techex Europe 2025: Practical learning for AI leaders

September 20, 2025

Scaleway hugging face reasoning provider

September 20, 2025

Discover new solutions to the first century’s problems in fluid dynamics

September 19, 2025
Service Area
X (Twitter) Instagram YouTube TikTok Threads RSS
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
© 2025 Versa AI Hub. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.

Sign In or Register

Welcome Back!

Login to your account below.

Lost password?