Versa AI hub
Tools

Hugging Face and FriendliAI partner to enhance model deployment in hubs

January 22, 2025

FriendliAI’s inference infrastructure is now integrated into the Hugging Face Hub as an option under the “Deploy this model” button, simplifying and speeding up the delivery of generative AI models.

Partnering to drive AI innovation

Hugging Face enables developers, researchers, and businesses to innovate with AI. Our shared priority is to build impactful partnerships that simplify workflows and provide cutting-edge tools to the AI community.

Today, we are excited to announce a partnership between Hugging Face and FriendliAI, a leader in accelerating generative AI inference, to enhance the way developers deploy and manage AI models. This integration introduces Friendli Endpoints as a deployment option within the Hugging Face Hub, giving developers direct access to high-performance, cost-effective inference infrastructure.

FriendliAI features breakthrough technologies such as continuous batching, native quantization, and best-in-class autoscaling, and has been rated by Artificial Analysis as the fastest GPU-based generative AI inference provider. With this technology, FriendliAI continues to raise the bar for AI inference performance, delivering faster processing, lower latency, and significant cost savings when deploying generative AI models at scale. Through this partnership, Hugging Face users and FriendliAI customers can easily deploy open-source or custom generative AI models with unparalleled efficiency and reliability.

Simplify model deployment

Last year, FriendliAI introduced the Hugging Face integration, allowing users to seamlessly deploy Hugging Face models directly within the Friendli Suite platform. With this integration, users now have access to thousands of open source models supported by Hugging Face and can easily deploy private models. A list of model architectures currently supported by FriendliAI can be found here.
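The “supported architectures” check above can be approximated locally: every model on the Hugging Face Hub publishes a `config.json` whose `architectures` field names its model class, and that name can be compared against whatever set FriendliAI supports. A minimal sketch, with an illustrative support list that is an assumption (the authoritative list lives in FriendliAI’s documentation, not in this article):

```python
import json

# Illustrative, assumed support list -- consult FriendliAI's docs for the real one.
ASSUMED_SUPPORTED = {"LlamaForCausalLM", "MistralForCausalLM", "GemmaForCausalLM"}

def is_deployable(config_json: str, supported=ASSUMED_SUPPORTED) -> bool:
    """Return True if any architecture in a model's config.json is in the support set."""
    config = json.loads(config_json)
    return any(arch in supported for arch in config.get("architectures", []))

# Example config.json excerpt, in the shape published with Llama-style checkpoints.
example = '{"architectures": ["LlamaForCausalLM"], "model_type": "llama"}'
print(is_deployable(example))  # True
```

In practice you would fetch `config.json` from the Hub rather than hard-coding it; the comparison logic stays the same.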

Now, we’re taking this integration even further by enabling the same functionality directly within Hugging Face Hub and providing one-click deployment for a seamless user experience. You can use your Friendli Suite account to deploy models directly from the model card in Hugging Face Hub.

Hugging Face’s Friendli Inference Deployment Options

Selecting Friendli Endpoints takes you to the FriendliAI model deployment page. Here, you can interact with an optimized open-source model and deploy it to NVIDIA H100 GPUs. The deployment page provides an intuitive interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. You can also chat directly with your open-source model on the page during the deployment process, making it easy to explore and test its capabilities.

Deploy models on NVIDIA H100 GPUs with Friendli Dedicated Endpoints

Using FriendliAI’s GPU-optimized inference engine, Dedicated Endpoints deliver fast and cost-effective inference as a managed service. Developers can deploy open-source or custom models to NVIDIA H100 GPUs by clicking “Deploy Now” on the model deployment page.

Although H100 GPUs are powerful, they can be expensive to operate at scale. FriendliAI’s optimized service can significantly cut costs by reducing the number of GPUs required while maintaining peak performance. Dedicated Endpoints are not only cost-effective but also hide the complexity of infrastructure management.

Deploy Hugging Face models from the model deployment page

Running inference on open-source models with Friendli Serverless Endpoints

Friendli Serverless Endpoints are an ideal solution for developers who want to run inference on open-source models efficiently. The service provides a user-friendly API for models optimized by FriendliAI, ensuring high performance at low cost. You can chat directly with these powerful open-source models on the model deployment page.
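The article does not show what the serverless API looks like, so here is a minimal sketch of building a chat-completion request against it, assuming an OpenAI-compatible interface at `https://api.friendli.ai/serverless/v1` with a bearer token; the base URL and the model ID are assumptions for illustration, not taken from this article:

```python
import json
import os
import urllib.request

# Assumed base URL for Friendli Serverless Endpoints (OpenAI-compatible);
# check FriendliAI's documentation for the current value.
FRIENDLI_BASE_URL = "https://api.friendli.ai/serverless/v1"

def build_chat_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Construct an OpenAI-style chat-completion request for a Friendli endpoint."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{FRIENDLI_BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Illustrative model ID -- substitute one available in your Friendli Suite account.
req = build_chat_request(
    model="meta-llama-3.1-8b-instruct",
    prompt="Summarize what continuous batching does.",
    token=os.environ.get("FRIENDLI_TOKEN", "dummy-token"),
)
print(req.full_url)  # https://api.friendli.ai/serverless/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(req)` (with a real token) would return a JSON completion in the usual OpenAI response shape, if the compatibility assumption holds.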

Try out serverless endpoints on the model deployment page

What’s next

We are excited to deepen the collaboration between Hugging Face and FriendliAI and make open-source AI more accessible to developers around the world. FriendliAI’s fast and cost-effective inference solution removes the complexity of infrastructure management, letting users focus on innovation. Together with FriendliAI, we remain committed to transforming the way AI is developed and driving breakthrough innovations that will shape the next era of AI.

You can also follow our organization page to stay updated on upcoming news 🔥

© 2025 Versa AI Hub. All Rights Reserved.