Close Menu
Versa AI hub
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

What's Hot

Brings serverless GPU reasoning to hug face users

June 22, 2025

Will AI replace you or will it promote you? How to stay first

June 22, 2025

Fast set-fit reasoning with optimal Intel on Xeon

June 22, 2025
Facebook X (Twitter) Instagram
Versa AI hubVersa AI hub
Sunday, June 22
Facebook X (Twitter) Instagram
Login
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
Versa AI hub
Home»Tools»Brings serverless GPU reasoning to hug face users
Tools

Brings serverless GPU reasoning to hug face users

versatileaiBy versatileaiJune 22, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
#image_title
Share
Facebook Twitter LinkedIn Pinterest Email

Update (November 2024): The integration is no longer available. Switch to the inference API, inference endpoint, or other deployment options for hugging faces, depending on your AI model’s needs.

Today, we are excited to announce the launch of Deploy for CloudFlare Workers AI, a new integration for Hugging Face Hub. Deployed to CloudFlare Workers AI easily uses the open model as a serverless API with cutting-edge GPUs deployed in CloudFlare Edge data centers. Starting today, we’ve integrated some of the most popular open models that will hug your face to CloudFlare Worker AI with production solutions, including text generation inference.

Deploying CloudFlare Worker AI allows developers to build robust, generated AI applications without managing GPU infrastructure and servers, with very low operating costs.

Generated AI for developers

This new experience expands the strategic partnership that was announced last year to simplify access and deployment of open-generated AI models. One of the main issues faced by developers and organizations is the scarcity of GPU availability and the fixed cost of deploying servers to start buildings. CloudFlare Worker Deployment AI offers easy, low-cost solutions to these challenges, providing serverless access to popular embrace face models.

Let’s take a look at a concrete example. Imagine developing a RAG application that gets ~1000 requests per day and developing a 1K token input and a 100 token output using Meta Llama 2 7b. The production cost of LLM inference is approximately $1 per day.

“We look forward to achieving this integration very quickly. By putting the power of CloudFlare’s global network of serverless GPUs into the hands of developers, we are opening the door to many exciting innovations by communities around the world.”

How it works

It’s very easy to use embracing face models with CloudFlare Worker AI. Below are step-by-step instructions on how to use the Hermes 2 Pro with the latest model from Nous Research, the Mistral 7b.

You can find all the models available in this CloudFlare collection.

Note: You will need to access your CloudFlare account and API tokens.

You can find CloudFlare deployment options on all available model pages, including models such as Llama, Gemma, Mistral, and more.

Model Card

Open the Deployment menu and select CloudFlare Workers AI. This opens an interface on how to use this model and how to send requests.

Note: If the model you are using does not have the “CloudFlare Workers AI” option, it is currently not supported. We are working with CloudFlare to increase the availability of our models. Please contact us using your request at api-enterprise@huggingface.co.

Inference snippet

Currently, integration is available through two options. It can be used directly by workers using the Worker AI REST API or using the CloudFlare AI SDK. Select the option you want and copy the code to your environment. When using the REST API, you must ensure that the Account_ID and API_TOKEN variables are defined.

that’s it! You can now begin sending requests to hug face models hosted by CloudFlare Worker AI. Make sure to use the correct prompts and templates that your model expects.

I’ve just started

We are excited to work with CloudFlare to make AI more accessible for developers. Work with the CloudFlare team to make more models and experiences available!

author avatar
versatileai
See Full Bio
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleWill AI replace you or will it promote you? How to stay first
versatileai

Related Posts

Tools

Fast set-fit reasoning with optimal Intel on Xeon

June 22, 2025
Tools

Hugging Wiz Research and facial partners to improve AI security

June 21, 2025
Tools

Text2SQL using Face Dataset Viewer API and MotherDuckDuckDB-NSQL-7B

June 21, 2025
Add A Comment

Comments are closed.

Top Posts

Piclumen Art V1: Next Generation AI Image Generation Model Launches for Digital Creators | Flash News Details

June 5, 20253 Views

New Star: Discover why 보니 is the future of AI art

February 26, 20252 Views

How to build an MCP server with Gradio

April 30, 20251 Views
Stay In Touch
  • YouTube
  • TikTok
  • Twitter
  • Instagram
  • Threads
Latest Reviews

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

Most Popular

Piclumen Art V1: Next Generation AI Image Generation Model Launches for Digital Creators | Flash News Details

June 5, 20253 Views

New Star: Discover why 보니 is the future of AI art

February 26, 20252 Views

How to build an MCP server with Gradio

April 30, 20251 Views
Don't Miss

Brings serverless GPU reasoning to hug face users

June 22, 2025

Will AI replace you or will it promote you? How to stay first

June 22, 2025

Fast set-fit reasoning with optimal Intel on Xeon

June 22, 2025
Service Area
X (Twitter) Instagram YouTube TikTok Threads RSS
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
© 2025 Versa AI Hub. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.

Sign In or Register

Welcome Back!

Login to your account below.

Lost password?