FriendliAI’s inference infrastructure is integrated into Hugging Face Hub as an option on the “Deploy this model” button to simplify and speed up the delivery of generative AI models.
Collaborate to drive AI innovation
Hugging Face enables developers, researchers, and businesses to innovate with AI. Our shared priority is to build impactful partnerships that simplify workflows and provide cutting-edge tools to the AI community.
Today, we are excited to announce a partnership between Hugging Face and FriendliAI, a leader in accelerating generative AI inference, to enhance the way developers deploy and manage AI models. This integration introduces the FriendliAI endpoint as a deployment option within Hugging Face Hub, giving developers direct access to high-performance, cost-effective inference infrastructure.
FriendliAI features breakthrough technologies such as continuous batching, native quantization, and best-in-class autoscaling, and is rated by Artificial Analysis as the fastest GPU-based generative AI inference provider. With this technology, FriendliAI continues to raise the bar for AI inference performance, delivering faster speeds, lower latency, and significant cost savings for deploying generative AI models at scale. Through this partnership, Hugging Face users and FriendliAI customers will be able to easily deploy open source or custom generative AI models with unparalleled efficiency and reliability.
Simplify model deployment
Last year, FriendliAI introduced the Hugging Face integration, allowing users to seamlessly deploy Hugging Face models directly within the Friendli Suite platform. With this integration, users now have access to thousands of open source models supported by Hugging Face and can easily deploy private models. A list of model architectures currently supported by FriendliAI can be found here.
Now, we’re taking this integration even further by enabling the same functionality directly within Hugging Face Hub and providing one-click deployment for a seamless user experience. You can use your Friendli Suite account to deploy models directly from the model card in Hugging Face Hub.
Selecting Friendli Endpoints takes you to the FriendliAI model deployment page. Here, you can interact with an optimized open source model and deploy it to NVIDIA H100 GPUs. The deployment page provides an intuitive interface for setting up Friendli Dedicated Endpoints, a managed service for generative AI inference. Additionally, you can chat directly with your open source model on the page during the deployment process, making it easy to explore and test its functionality.
Deploy models on NVIDIA H100 GPUs with Friendli Dedicated Endpoints
Powered by FriendliAI’s GPU-optimized inference engine, Friendli Dedicated Endpoints deliver fast and cost-effective inference as a managed service. Developers can deploy open source or custom models to NVIDIA H100 GPUs by clicking “Deploy Now” on the model deployment page.
Although the H100 GPU is powerful, it can be expensive to operate at scale. FriendliAI’s optimized services can significantly reduce costs by reducing the number of GPUs required while maintaining peak performance. Dedicated endpoints are not only cost-effective, but also simplify the complexity of infrastructure management.
Run inference on open source models with Friendli Serverless Endpoints
Friendli Serverless Endpoints is an ideal solution for developers who want to run inference on open source models efficiently. The service provides a user-friendly API for models optimized by FriendliAI, ensuring high performance at low cost. You can also chat directly with these powerful open source models on the model deployment page.
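As a rough sketch of what calling a serverless endpoint could look like, the snippet below builds an OpenAI-style chat completion request. The base URL, model name, and environment variable here are illustrative assumptions, not confirmed values; check the FriendliAI documentation for the actual API details.

```python
import json
import os
import urllib.request

# Assumed endpoint URL for illustration only -- consult the FriendliAI
# docs for the real serverless API base URL and supported model names.
API_URL = "https://api.friendli.ai/serverless/v1/chat/completions"


def build_chat_request(model: str, prompt: str, token: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion HTTP request (not yet sent)."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # FRIENDLI_TOKEN is a hypothetical variable name for your API credential.
    token = os.environ.get("FRIENDLI_TOKEN")
    if token:  # only send the request when a credential is configured
        req = build_chat_request("meta-llama-3.1-8b-instruct", "Hello!", token)
        with urllib.request.urlopen(req) as resp:
            body = json.loads(resp.read())
            print(body["choices"][0]["message"]["content"])
```

Because the request is built separately from being sent, you can inspect or log the payload before making a billable API call.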
What’s next
We are excited to deepen our collaboration with FriendliAI and increase the accessibility of open source AI to developers around the world. FriendliAI’s fast and cost-effective inference solution eliminates the complexity of infrastructure management, allowing users to focus on innovation. Together with FriendliAI, we remain committed to transforming the way AI is developed and driving breakthrough innovations that will shape the next era of AI.
You can also follow our organization page to stay updated on upcoming news 🔥