
Introducing HUGS – Scale AI using open models

December 27, 2024 · Updated: February 13, 2025 · 5 Mins Read

Today, we are excited to announce the launch of Hugging Face Generative AI Services (also known as HUGS). It is an optimized, zero-configuration inference microservice designed to simplify and accelerate the development of AI applications using open models. Built on open source Hugging Face technologies such as Text Generation Inference and Transformers, HUGS provides the best solution for efficiently building and scaling Generative AI applications on your own infrastructure. HUGS is optimized to run open models on a variety of hardware accelerators, including NVIDIA GPUs, AMD GPUs, and soon AWS Inferentia and Google TPUs.

Zero-configuration, optimized inference for open models

HUGS simplifies optimized deployment of open models on proprietary infrastructure and a variety of hardware. One of the key challenges facing developers and organizations is the engineering complexity of optimizing LLM inference workloads on a given GPU or AI accelerator. HUGS allows for maximum throughput deployment of the most popular open LLMs with no configuration required. Each deployment configuration provided by HUGS is fully tested and maintained to work out of the box.

HUGS model deployment provides an OpenAI-compatible API for drop-in replacement of existing Generative AI applications built on model provider APIs. Simply point your code to a HUGS deployment and power your applications using an open model hosted on your own infrastructure.
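To make "drop-in replacement" concrete, here is a minimal sketch of what an OpenAI-style chat-completion request to a HUGS deployment looks like, using only the Python standard library. The base URL and the `/v1/chat/completions` path are assumptions based on the OpenAI-compatible API convention, not details of any specific deployment:

```python
import json
import urllib.request

def build_request(base_url, messages, max_tokens=128):
    # Build an OpenAI-style chat-completion request for a HUGS deployment.
    # Switching an app from a hosted provider API to HUGS is, in essence,
    # just a matter of changing this base URL.
    body = json.dumps({"messages": messages, "max_tokens": max_tokens}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def chat(base_url, messages, **kwargs):
    # Send the request and decode the JSON response.
    with urllib.request.urlopen(build_request(base_url, messages, **kwargs)) as resp:
        return json.loads(resp.read())

# Build (but do not send) a request against a hypothetical local deployment:
req = build_request("http://localhost:8080", [{"role": "user", "content": "Hi"}])
print(req.full_url)  # http://localhost:8080/v1/chat/completions
```

In a real application you would call `chat(...)` against your own endpoint; everything else in the calling code can stay as it was with the provider API.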

Why HUGS?

HUGS provides an easy way to build AI applications using open models hosted on your own infrastructure, with the following benefits:

  • Within your infrastructure: Deploy open models inside your own secure environment and keep your data and models off the internet.
  • Zero-configuration deployment: HUGS reduces deployment time from weeks to minutes, automatically optimizing model and service configurations for NVIDIA GPUs, AMD GPUs, or AI accelerators.
  • Hardware-optimized inference: HUGS is built on Hugging Face's Text Generation Inference (TGI) and is tuned for the best performance across a variety of hardware configurations.
  • Hardware flexibility: Run HUGS on a variety of accelerators, including NVIDIA GPUs and AMD GPUs; support for AWS Inferentia and Google TPUs is coming soon.
  • Model flexibility: HUGS is compatible with a wide range of open-source models, ensuring flexibility and choice for your AI applications.
  • Industry-standard API: Deploy HUGS easily using Kubernetes, with OpenAI API-compatible endpoints and minimal code changes.
  • Enterprise distribution: HUGS is an enterprise distribution of Hugging Face open-source technology, offering long-term support, rigorous testing, and SOC2 compliance.
  • Enterprise compliance: Minimize compliance risk with the necessary licenses and terms of use included.

We provided early access to HUGS to some Enterprise Hub customers.

HUGS saves us a lot of time when deploying ready-to-use models locally with great performance. What used to take a week before HUGS can now be done in less than an hour. For customers with sovereign AI requirements, this is a game changer. – Henri Jouhaud, CTO, Polyconseil

I tried deploying Gemma 2 on GCP on an L4 GPU with HUGS. There was no need to modify libraries, versions, or parameters; it could be used as is. HUGS gives us the confidence to extend our internal use of open models. – Ghislain Putois, Research Engineer, Orange

How it works

Using HUGS is easy. Here’s how to get started:

Note: Depending on the deployment method you choose, you will need access to the appropriate subscription or marketplace product.

Where to find HUGS

HUGS is available through several channels.

  • Cloud Service Provider (CSP) marketplaces: Find and deploy HUGS on Amazon Web Services (AWS) and Google Cloud Platform (GCP); support for Microsoft Azure is coming soon.
  • DigitalOcean: HUGS is available natively within DigitalOcean as a new 1-Click Model service powered by Hugging Face HUGS and GPU Droplets.
  • Enterprise Hub: If your organization has upgraded to Enterprise Hub, contact your sales team to gain access to HUGS.

Please refer to the related documentation linked above for specific deployment instructions for each platform.

Pricing

HUGS offers on-demand pricing based on uptime for each container, except for deployments on DigitalOcean.

  • AWS Marketplace and Google Cloud Platform Marketplace: $1 per hour per container with no minimum fee (compute usage is billed separately by the CSP). AWS offers a 5-day free trial where you can test HUGS at no charge.
  • DigitalOcean: The 1-Click Models powered by Hugging Face HUGS come at no additional cost on DigitalOcean; regular GPU Droplet compute costs apply.
  • Enterprise Hub: Custom HUGS access is available for Enterprise Hub organizations. Please contact our sales team for more information.
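To make the marketplace pricing concrete, here is a back-of-the-envelope cost estimate. Only the $1/hour HUGS container rate comes from the pricing above; the GPU instance rate used below is a made-up placeholder, since actual compute prices vary by CSP and instance type:

```python
HUGS_RATE = 1.00  # USD per container-hour on the AWS/GCP marketplaces

def monthly_cost(containers, gpu_rate, hours_per_day=24, days=30):
    # Total = HUGS container fee plus the CSP's own compute charge,
    # both billed per container-hour.
    hours = hours_per_day * days
    return containers * hours * (HUGS_RATE + gpu_rate)

# One container around the clock for a 30-day month, on a hypothetical
# $1.50/hour GPU instance:
print(monthly_cost(1, gpu_rate=1.50))  # 720 h * $2.50/h = 1800.0
```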

Performing inference

HUGS is based on Text Generation Inference (TGI) and provides a seamless inference experience. For detailed instructions and examples, see the Performing Inference with HUGS guide. HUGS leverages the OpenAI-compatible Messages API, allowing you to use familiar tools and libraries to send requests, such as cURL, huggingface_hub SDK, and openai SDK.

from huggingface_hub import InferenceClient

ENDPOINT_URL = "REPLACE"  # URL of your HUGS deployment

client = InferenceClient(base_url=ENDPOINT_URL, api_key="-")

chat_completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "What is deep learning?"},
    ],
    temperature=0.7,
    top_p=0.95,
    max_tokens=128,
)
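Because the response follows the OpenAI chat-completion shape, extracting the generated text is the same regardless of which client sent the request. The response body below is abridged and purely illustrative (the model id and text are invented for the example):

```python
# Abridged, illustrative response from the OpenAI-compatible Messages API.
response = {
    "model": "example-org/example-model",  # hypothetical model id
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Deep learning is ..."},
            "finish_reason": "stop",
        },
    ],
    "usage": {"prompt_tokens": 5, "completion_tokens": 12, "total_tokens": 17},
}

# The generated text lives in the first choice's message content.
answer = response["choices"][0]["message"]["content"]
print(answer)  # Deep learning is ...
```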

Supported models and hardware

HUGS supports a growing set of open models and hardware platforms. Please see the Supported Models and Supported Hardware pages for the latest information.

Today we are releasing 13 popular open LLMs.

See the documentation for details on supported model and hardware combinations.

Get started with HUGS now

HUGS makes it easy to harness the power of open models, with zero-configuration, optimized inference inside your own infrastructure. With HUGS, you can take control of your AI applications and easily move proof-of-concept applications built on closed models to self-hosted open models.

Get started today and deploy HUGS on AWS, Google Cloud, or DigitalOcean.
