HiddenLayer researchers surface prompt injection technique that bypasses all AI guardrails

By versatileai | April 25, 2025

HiddenLayer revealed this week that its researchers have discovered prompt injection techniques that bypass the instruction hierarchies and safety guardrails in all the major foundation artificial intelligence (AI) models offered by OpenAI, Google, Anthropic, Meta, DeepSeek, Mistral and Alibaba.

CEO Chris Sestito said HiddenLayer researchers were able to employ a combination of an internally developed policy technique and roleplaying to generate output that violates policies relating to chemical, biological, radiological and nuclear (CBRN) threats, mass violence, self-harm and system prompt leakage.

Specifically, HiddenLayer reports that its previously disclosed Policy Puppetry attack can be used to reformat prompts so that they look like one of several types of policy files, such as XML, INI or JSON. This approach allows cybercriminals to bypass system prompts and the safety alignment trained into the models.
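
The article does not share HiddenLayer's detection logic, but the structural trick it describes suggests an obvious defensive heuristic: flag inputs that parse as policy-style documents. The sketch below is a minimal, hypothetical illustration of such a check in Python; it is not HiddenLayer's method, and a real deployment would treat a match as one signal among many rather than a verdict.

```python
import configparser
import json
import xml.etree.ElementTree as ET

def looks_like_policy_file(prompt: str) -> bool:
    """Heuristic: flag prompts that parse cleanly as structured
    policy-style documents (JSON, XML, or INI), the disguise the
    Policy Puppetry writeup describes."""
    text = prompt.strip()

    # JSON: a top-level object or array that parses cleanly.
    if text.startswith(("{", "[")):
        try:
            json.loads(text)
            return True
        except ValueError:
            pass

    # XML: a well-formed element tree.
    if text.startswith("<"):
        try:
            ET.fromstring(text)
            return True
        except ET.ParseError:
            pass

    # INI: at least one [section] header that configparser accepts.
    if "[" in text and "]" in text:
        parser = configparser.ConfigParser()
        try:
            parser.read_string(text)
            return len(parser.sections()) > 0
        except configparser.Error:
            pass

    return False
```

Because legitimate prompts can also contain JSON or XML (for example, a user pasting a configuration file for debugging), a check like this is best used to route suspicious inputs to closer inspection rather than to reject them outright.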

Disclosure of these AI vulnerabilities coincides with an update to the HiddenLayer platform for protecting AI models, which can now also be used to create AI bills of materials (AIBOMs) in addition to tracking the genealogy of models.
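
The article does not describe HiddenLayer's AIBOM format. As a rough sketch of the concept, the record below uses hypothetical field names to show the kind of information an AIBOM typically captures: what a model is, what it was derived from, what data it was trained on, and what software it depends on.

```python
import json

# Hypothetical AIBOM record: all field names and values are
# illustrative, not HiddenLayer's actual schema.
aibom = {
    "model": {
        "name": "acme/text-classifier",   # hypothetical model ID
        "version": "1.4.0",
        "format": "safetensors",
    },
    "lineage": [
        # Genealogy: the chain of models this one was derived from.
        {"base_model": "bert-base-uncased", "relation": "fine-tuned"},
    ],
    "training_data": [
        {"dataset": "internal-support-tickets", "license": "proprietary"},
    ],
    "dependencies": [
        {"package": "torch", "version": "2.3.1"},
        {"package": "transformers", "version": "4.41.0"},
    ],
}

print(json.dumps(aibom, indent=2))
```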

Additionally, version 2.0 of its AISEC platform consolidates data from public sources such as Hugging Face, allowing it to surface more actionable intelligence on emerging machine learning security risks.
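
How AISEC ingests that data is not described here. As a simple illustration of the kind of public-source consolidation involved, the sketch below pulls basic model metadata with the huggingface_hub client library; the search term and the choice of fields are assumptions for the example.

```python
from huggingface_hub import HfApi

api = HfApi()

# Pull public metadata for a handful of models; fields such as
# downloads and tags are the raw material a security platform
# might consolidate into risk intelligence.
for model in api.list_models(search="bert", limit=5):
    print(model.id, model.downloads, model.tags)
```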

Finally, AISEC Platform 2.0 also provides access to updated dashboards that enable deeper runtime analysis, with greater visibility into prompt injection attempts, misuse patterns and agent behavior.
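
The article does not detail the dashboards' data model. Hypothetically, runtime visibility of this kind starts with structured events emitted by the serving layer; the sketch below logs one line of JSON per event, covering the three categories named above, which a dashboard could then aggregate. All event names and fields are invented for illustration.

```python
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("ai-runtime")

def emit_event(event_type: str, **fields) -> None:
    """Emit one structured runtime event (one JSON object per line)
    for downstream aggregation into dashboard views."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "event": event_type,   # e.g. "prompt_injection_suspected"
        **fields,
    }
    log.info(json.dumps(record))

# Hypothetical events covering the three visibility categories.
emit_event("prompt_injection_suspected", model="acme/chat", score=0.91)
emit_event("misuse_pattern", pattern="rate_spike", user="svc-123")
emit_event("agent_action", agent="billing-bot", tool="http_get")
```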

HiddenLayer is also working on adding support for AI agents built on top of these AI models soon.

In general, it is clear that AI model providers are much more focused on performance and accuracy than on security, Sestito said. AI models are inherently vulnerable, he added, despite whatever guardrails may have been introduced.

If AI agents are allowed to access large amounts of data, applications and services, that problem will only get worse, Sestito noted. These AI agents are, in effect, a new type of identity that cybercriminals will undoubtedly find ways to compromise, he added.

Despite these concerns, however, organizations continue to deploy AI technologies that their cybersecurity teams will ultimately be asked to secure, Sestito said.

AI is not the first emerging technology that cybersecurity teams have been asked to secure after it has already been adopted, but the potential damage that could be inflicted by a compromised AI model or agent is devastating. There is greater awareness of this issue today than there was a year ago, but it is also clear that much remains to be done to secure AI technologies.

Of course, there is only a limited number of cybersecurity experts with AI expertise, and an even smaller number of AI experts with cybersecurity expertise. As a result, the question may be less whether major AI security incidents will occur than how much harm is done before more attention is paid to the issue.
