Research

HiddenLayer researchers surface technique that bypasses all AI guardrails

By versatileai · April 25, 2025 · 3 Mins Read

HiddenLayer revealed this week that its researchers discovered prompt injection techniques that bypass instruction hierarchies and safety guardrails in all the major foundation artificial intelligence (AI) models offered by OpenAI, Google, Anthropic, Meta, DeepSeek, Mistral and Alibaba.

CEO Chris Sestito said HiddenLayer researchers were able to employ a combination of internally developed policy techniques and role-play to generate output that violates policies related to chemical, biological, radiological and nuclear threats, mass violence, self-harm, and system prompt leakage.

Specifically, HiddenLayer reports that its previously disclosed Policy Puppetry attack can be used to reformulate prompts to look like one of several types of policy files, such as XML, INI, or JSON. This approach allows cybercriminals to bypass system prompts and the safety alignments trained into models.
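The report frames the attack as a formatting trick: a safety filter tuned for natural-language prompts may wave through the same instructions dressed up as a configuration document. As a rough illustration (not HiddenLayer's tooling, and easy to evade on its own), a defender could add a pre-filter that flags prompts which parse cleanly as XML, INI, or JSON:

```python
import configparser
import json
import xml.etree.ElementTree as ET


def looks_like_policy_file(prompt: str) -> bool:
    """Heuristically flag prompts structured as JSON, XML, or INI,
    the policy-file shapes the Policy Puppetry report describes."""
    text = prompt.strip()

    # JSON: the whole prompt parses as an object or array
    try:
        parsed = json.loads(text)
        if isinstance(parsed, (dict, list)):
            return True
    except ValueError:
        pass

    # XML: the whole prompt parses as a single element tree
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        pass

    # INI: configparser accepts it and finds at least one [section]
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
        if parser.sections():
            return True
    except configparser.Error:
        pass

    return False
```

A heuristic like this is at best one signal among many; an attacker can embed the policy snippet inside surrounding prose, so it complements rather than replaces model-side alignment and runtime monitoring.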


Disclosure of these AI vulnerabilities coincides with an update to the HiddenLayer platform for protecting AI models, which can also be used to create AI bills of materials (AIBOMs) and to track model lineage.
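The article does not detail HiddenLayer's AIBOM format. As a hypothetical sketch, an AI bill of materials is simply a structured inventory of a model's components and lineage; the field names below are illustrative, not HiddenLayer's schema:

```python
import json

# Hypothetical, minimal AIBOM record: an inventory of what went into a
# model, so its provenance can be audited like a software SBOM.
aibom = {
    "model": {"name": "example-llm", "version": "1.0"},
    "base_model": "example-base-7b",        # lineage: fine-tuned from this
    "datasets": ["example-corpus-v2"],      # training data provenance
    "dependencies": ["pytorch==2.3.0"],     # software components
    "license": "apache-2.0",
}

# Serialize for storage or exchange alongside the model artifact
aibom_json = json.dumps(aibom, indent=2)
```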

Additionally, version 2.0 of its AISec Platform consolidates data from public sources such as Hugging Face, allowing it to surface more actionable intelligence on emerging machine learning security risks.

Finally, AISec Platform 2.0 also provides updated dashboards that allow for deeper runtime analysis, with greater visibility into prompt injection attempts, misuse patterns, and agent behavior.

HiddenLayer is also working on adding support for AI agents built on top of AI models.

In general, it is clear that AI model providers are far more focused on performance and accuracy than on security, Sestito said. AI models are inherently vulnerable, he added, despite whatever guardrails may have been introduced.

If AI agents are allowed to access data, applications, and services at scale, the problem becomes even more acute, Sestito noted. These AI agents are, in effect, a new type of identity that cybercriminals will undoubtedly find ways to compromise, he added.

Despite these concerns, organizations continue to deploy AI technologies that their cybersecurity teams will ultimately be asked to secure, Sestito said.

AI is not the first emerging technology that cybersecurity teams have been asked to secure after it has already been adopted, but the potential damage inflicted by a compromised AI model or agent could be devastating. There is greater awareness of this issue today than a year ago, yet much work clearly remains to secure AI technologies.

Of course, there is a limited number of cybersecurity experts with AI expertise, and even fewer AI experts with cybersecurity expertise. Thus, the question may be less whether major AI security incidents will occur than how much harm is done before the issue receives the attention it needs.
