F5 expands NVIDIA LLM Routing and Security-enhanced AI Infrastructure

By versatileai | June 17, 2025

F5, a global leader in the delivery and security of applications and APIs, has announced new capabilities for F5 BIG-IP Next for Kubernetes, accelerated with NVIDIA BlueField-3 DPUs and the NVIDIA DOCA software framework.

Sesterce is a leading European operator specializing in next-generation infrastructure and sovereign AI, built to meet the needs of accelerated computing and artificial intelligence.





With this expansion of the F5 Application Delivery and Security Platform, BIG-IP Next for Kubernetes, running natively on NVIDIA BlueField-3 DPUs, provides high-performance traffic management and security for large-scale AI infrastructure, unlocking efficiency, control, and performance improvements for AI applications. Building on the compelling performance benefits announced alongside general availability earlier this year, Sesterce has completed validation of the combined F5 and NVIDIA solution across several key capabilities, including the following areas:

  • Increased performance, multi-tenancy, and security to meet cloud-grade expectations; initial testing showed a 20% improvement in GPU utilization.
  • Integration with NVIDIA Dynamo and the KV Cache Manager to reduce inference latency and optimize GPU and memory resources in large language model (LLM) inference systems.
  • Smart LLM routing on BlueField DPUs, running effectively alongside NVIDIA NIM microservices for workloads that require multiple models, giving customers the best of all available models.
  • Scaling and securing the Model Context Protocol (MCP), including reverse-proxy functionality and protections for more scalable and secure LLMs, so the capabilities of MCP servers are available quickly and safely.
  • Data programmability through robust F5 iRules capabilities, allowing rapid customization to support AI applications and evolving security requirements.

Highlights of the new solution's features include:

With this collaborative solution, LLM routing and dynamic load balancing using BIG-IP Next for Kubernetes let organizations route simple AI-related tasks to cheaper, lightweight LLMs when supporting generative AI, while reserving advanced models for complex queries. This level of customizable intelligence also enables routing functions to leverage domain-specific LLMs, improving output quality and significantly improving the customer experience. F5's advanced traffic management ensures queries are sent to the most appropriate LLM, lowering latency and improving time to first token.
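The routing idea described above, sending lightweight queries to a cheap model while reserving a larger model for complex ones, can be sketched in a few lines. This is a minimal conceptual illustration, not F5's implementation: the complexity heuristic, threshold, and model names are all hypothetical.

```python
# Minimal sketch of complexity-based LLM routing (hypothetical heuristic,
# not F5's implementation). Simple queries go to a cheap, lightweight
# model; complex ones are reserved for a larger model.

CHEAP_MODEL = "small-llm"      # hypothetical model names
ADVANCED_MODEL = "large-llm"

def estimate_complexity(query: str) -> float:
    """Crude heuristic: longer, multi-part, analytical questions score higher."""
    score = len(query.split()) / 50.0
    score += query.count("?") * 0.2
    if any(kw in query.lower() for kw in ("explain", "compare", "analyze")):
        score += 0.5
    return score

def route(query: str, threshold: float = 0.6) -> str:
    """Return the model a query should be dispatched to."""
    return ADVANCED_MODEL if estimate_complexity(query) >= threshold else CHEAP_MODEL
```

A production router would classify queries with a trained model rather than keyword heuristics, but the dispatch structure is the same.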

Optimizing GPUs for distributed AI inference at scale with NVIDIA Dynamo and KV cache integration

Introduced earlier this year, NVIDIA Dynamo provides a supplementary framework for deploying generative AI and reasoning models in large, distributed environments. NVIDIA Dynamo streamlines the complexity of running AI inference in distributed environments by orchestrating tasks such as scheduling, routing, and memory management to ensure seamless operation under dynamic workloads. Offloading specific operations from CPUs to BlueField DPUs is one of the core advantages of the combined F5 and NVIDIA solution. With F5, the Dynamo KV Cache Manager can intelligently route requests based on capacity, accelerating generative AI use cases through key-value (KV) caching, which speeds up processing by retaining information from previous operations rather than recomputing it. From an infrastructure perspective, organizations can store and reuse KV cache data at a small fraction of the cost of keeping it in GPU memory.
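The KV-cache idea, retaining the results of previous computation so that repeated prompt prefixes need not be reprocessed, can be illustrated with a toy cache. This is a conceptual sketch only: real KV caches store per-token attention key/value tensors, and the capacity-aware routing lives in Dynamo, not in application code like this.

```python
# Toy illustration of KV-cache-style reuse (conceptual only: a real KV
# cache holds per-token attention tensors, not strings).

class PrefixCache:
    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def process(self, prefix: str) -> str:
        """Return the (pretend) expensive prefill state for a prompt prefix,
        recomputing only on a cache miss."""
        if prefix in self._store:
            self.hits += 1
        else:
            self.misses += 1
            # Stand-in for expensive prefill computation.
            self._store[prefix] = f"state({len(prefix)} chars)"
        return self._store[prefix]

cache = PrefixCache()
cache.process("You are a helpful assistant.")   # miss: state computed
cache.process("You are a helpful assistant.")   # hit: state reused
```

The cost argument in the paragraph above follows directly: once cached state can be stored in cheaper memory tiers, only misses pay the GPU recomputation price.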

Improved protection for MCP servers with F5 and NVIDIA

The Model Context Protocol (MCP) is an open protocol developed by Anthropic that standardizes how applications provide context to LLMs. By deploying the combined F5 and NVIDIA solution in front of MCP servers, F5 technology acts as a reverse proxy, enhancing the security of MCP solutions and the LLMs they support. In addition, the full data programmability enabled by F5 iRules supports rapid adaptation to fast-evolving AI protocol requirements and further protection against emerging cybersecurity risks.
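A reverse proxy in front of an MCP server inspects requests before they reach the backend and rejects those that fail policy. The sketch below shows the idea with a hypothetical tool allow-list; it is not F5's product logic, and the tool names and request shape are invented for illustration.

```python
# Hypothetical reverse-proxy-style filter in front of an MCP server.
# Requests for tools outside the allow-list are rejected before they
# reach the backend (tool names here are invented for illustration).

ALLOWED_TOOLS = {"search_docs", "summarize"}

def backend_handle(request: dict) -> dict:
    # Stand-in for the real MCP server's request handler.
    return {"status": "ok", "tool": request["tool"]}

def proxy(request: dict) -> dict:
    """Forward only well-formed requests for allowed tools."""
    tool = request.get("tool")
    if tool not in ALLOWED_TOOLS:
        return {"status": "rejected", "reason": f"tool {tool!r} not allowed"}
    return backend_handle(request)
```

In the F5 deployment described above, checks of this kind would be expressed in iRules and executed on the DPU rather than in the application tier.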

BIG-IP Next for Kubernetes deployed on NVIDIA BlueField-3 DPUs is generally available now. For more information on the technology and deployment benefits, visit www.f5.com, and visit the companies at NVIDIA GTC Paris, part of this week's VivaTech 2025 event. For additional details, see F5's companion blog.

Youssef El Manssouri, CEO and co-founder of Sesterce

The integration between F5 and NVIDIA was appealing even before testing. Our results highlight the advantages of F5's dynamic load balancing with the massive volume of Kubernetes ingress and egress traffic in AI environments. This approach allows us to distribute traffic more efficiently and optimize GPU usage while bringing additional, unique value to our customers. We are pleased to see F5's support for a growing number of NVIDIA use cases, including enhanced multi-tenancy, and we look forward to additional innovation between the companies in support of next-generation AI infrastructure.

Kunal Anand, Chief Innovation Officer at F5

While enterprises increasingly deploy multiple LLMs to power advanced AI experiences, routing and classifying LLM traffic can be compute-heavy, degrading performance and the user experience. By programming routing logic directly on NVIDIA BlueField-3 DPUs, F5 BIG-IP Next for Kubernetes is the most efficient approach to delivering and securing LLM traffic. This is just the beginning. Our platform unlocks new possibilities for AI infrastructure, and we are excited to deepen our joint innovation with NVIDIA as enterprise AI continues to scale.

Ash Bhalgat, Senior Director of AI Networking and Security Solutions, Ecosystem and Marketing at NVIDIA

Accelerated with NVIDIA BlueField-3 DPUs, BIG-IP Next for Kubernetes gives enterprises and service providers a single point of control to efficiently route traffic to AI factories, optimizing GPU efficiency and accelerating AI traffic for data ingestion, model training, inference, RAG, and agentic AI. Additionally, F5's support for multi-tenancy and programmability through iRules continues to provide a platform well suited to ongoing integration and new functionality, including support for the NVIDIA Dynamo distributed KV Cache Manager.

Greg Schoeny, SVP, Global Service Provider at World Wide Technology

Organizations implementing agentic AI increasingly rely on MCP deployments to improve the security and performance of LLMs. By bringing advanced traffic management and security to large-scale Kubernetes environments, F5 and NVIDIA deliver an integrated set of AI capabilities not currently found elsewhere in the industry.
