Close Menu
Versa AI hub
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
  • Resources

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

What's Hot

Aluminum OS is the AI-powered successor to ChromeOS

December 7, 2025

Complete Swift client for Hugging Face

December 6, 2025

AlphaFold reveals key proteins behind heart disease

December 6, 2025
Facebook X (Twitter) Instagram
Versa AI hubVersa AI hub
Sunday, December 7
Facebook X (Twitter) Instagram
Login
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
  • Resources
Versa AI hub
Home»Research»HiddenLayer researchers quickly surface technology that bypasses all AI guardrails
Research

HiddenLayer researchers quickly surface technology that bypasses all AI guardrails

versatileaiBy versatileaiApril 25, 2025No Comments3 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
#image_title
Share
Facebook Twitter LinkedIn Pinterest Email

HiddenLayer revealed this week that researchers discovered rapid injection techniques to bypass instructional hierarchies and safety guardrails in all the major basic artificial intelligence (AI) models offered by OpenAI, Google, Humanity, Meta, Deepseek, Mistral and Alibaba.

CEO Chris Sestito said HiddenLayer researchers were able to employ a combination of internally developed policy techniques and role-play to generate output that violates policies related to chemistry, biology, radiation, nuclear research, mass violence, self-harm and system prompt leaks.

Specifically, HiddenLayer reports that previously disclosed policy puppet attacks can be used to reformulate the prompts to look like one of several types of policy files, such as XML, INI, or JSON. This approach allows cybercriminals to bypass system prompts and model-trained safety alignments.

AWS Hub

Disclosure of these AI vulnerabilities coincides with an update to the HiddenLayer platform to protect AI models that can also be used to create bills of AI materials (AIBOMs) in addition to providing the ability to track the genealogy of models.

Additionally, version 2.0 of its AISEC platform allows it to consolidate data from public sources such as embracing faces, allowing it to surface more practical intelligence on emerging machine learning security risks.

Finally, AISEC Platform 2.0 also provides access to updated dashboards that allow for deeper runtime analysis by providing greater visibility into rapid injection attempts, misuse patterns, and agent behavior.

Soon, HiddenLayer is also working on adding support for AI agents built on top of the AI ​​model.
In general, it is clear that AI model providers are much more focused on performance and accuracy than on security, Sestito said. The AI ​​model is inherently vulnerable, he added, despite the guardrails that may have been introduced.

If AI agents are allowed to access large-scale data, applications, and services, that problem will become even more problematic, Sestito noted. These AI agents are, in fact, a new type of identity that cybercriminals will undoubtedly find ways to compromise, he added.

But despite these concerns, organizations continue to deploy AI technology to ensure their cybersecurity teams are ultimately asked to ensure, Sestito said.

AI is not the first emerging technology that cybersecurity teams have been asked to secure after they have already been adopted, but the level of potential damage that could be inflicted by violations of AI models or agents can be devastating. There is greater awareness about this issue today than last year’s period, but it has become clear that there is a lot to be clear about the need to secure AI technology.

Of course, there is a limited number of cybersecurity experts with AI expertise, and a limited number of AI experts with even fewer cybersecurity concerns. Thus, it may not be as much a matter of whether there are major AI security incidents as possible as what harm is done before more attention is paid.

author avatar
versatileai
See Full Bio
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleFirefly brings AI-powered content creation to your creative cloud and marketing workflows
Next Article AI hits humanity, the environment remains dangerously unclear, government agencies warn
versatileai

Related Posts

Research

New AI research clarifies the origins of Papua New Guineans

July 22, 2025
Research

AI helps prevent medical errors in real clinics

July 22, 2025
Research

No one is surprised, and a new study says that AI overview causes a significant drop in search clicks

July 22, 2025
Add A Comment

Comments are closed.

Top Posts

UK and Germany plan to commercialize quantum supercomputing

December 5, 20255 Views

Tencent launches Hunyuan 3D AI asset generation engine

December 3, 20255 Views

Complete Swift client for Hugging Face

December 6, 20254 Views
Stay In Touch
  • YouTube
  • TikTok
  • Twitter
  • Instagram
  • Threads
Latest Reviews

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

Most Popular

UK and Germany plan to commercialize quantum supercomputing

December 5, 20255 Views

Tencent launches Hunyuan 3D AI asset generation engine

December 3, 20255 Views

Complete Swift client for Hugging Face

December 6, 20254 Views
Don't Miss

Aluminum OS is the AI-powered successor to ChromeOS

December 7, 2025

Complete Swift client for Hugging Face

December 6, 2025

AlphaFold reveals key proteins behind heart disease

December 6, 2025
Service Area
X (Twitter) Instagram YouTube TikTok Threads RSS
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
© 2025 Versa AI Hub. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.

Sign In or Register

Welcome Back!

Login to your account below.

Lost password?