Close Menu
Versa AI hub
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

What's Hot

Reddit appeals to humanity over AI data scraping

June 6, 2025

Grassley discusses the AI ​​whistleblower protection law in a “start point” interview

June 5, 2025

Piclumen Art V1: Next Generation AI Image Generation Model Launches for Digital Creators | Flash News Details

June 5, 2025
Facebook X (Twitter) Instagram
Versa AI hubVersa AI hub
Friday, June 6
Facebook X (Twitter) Instagram
Login
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
Versa AI hub
Home»Cybersecurity»Why AI behaves so creepy when faced with shutdown
Cybersecurity

Why AI behaves so creepy when faced with shutdown

versatileaiBy versatileaiJune 3, 2025No Comments5 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
#image_title
Share
Facebook Twitter LinkedIn Pinterest Email

AI has recently participated in several unstable behaviors.

Last week, Anthropic’s latest AI model, Claude Opus 4, displayed “extremely scary mail behavior” during a test that was shut down and given access to a fictitious email revealing that the person in charge was supposed to be cheating.

The situation didn’t happen organically. Claude Opus 4 was baited – and it took it. However, the test scenario demonstrated the ability of the AI ​​model to engage in manipulative behavior for self-preservation.

It’s not the first time.

Another recent experiment conducted by the researchers said that three advanced models from Openai “stopping” attempts to shut it down. In a post on X, the non-profit Palisade Research wrote that similar models such as Gemini, Claude and Grok follow shutdown instructions.

Other safety concerns were previously flagged on Openai’s O1 model. In December, Openai posted a blog that outlined research showing that AI models believed they were shut down while pursuing their goals and that their behavior was being monitored.

AI companies are transparent about risks by publishing safety cards and blog posts, but these models have been released despite expressing safety concerns.

So, should we be worried? BI spoke to five AI researchers to get better insight into why these instances are happening and what it means for the average person using AI.

AI learns behaviour just like humans

Most researchers at Bi said the results of the study were not surprising.

This is because AI models are trained in the same way as human training methods through aggressive reinforcement and reward systems.

“Training AI systems to pursue rewards is a recipe for developing AI systems with power-seeking behavior,” said Jeremie Harris, CEO of AI security consulting firm Gladstone, adding that many of this behavior is expected.

Harris compared training to what humans experience as they grow up. When a child does something good, it often rewards and is more likely to act that way in the future. AI models are taught to prioritize efficiency and complete tasks at hand, Harris said. AI said it is unlikely to achieve its target if it shuts down.

Robert Ghrist, dean of the undergraduate education at Penn Engineering, told BI that just as AI models learn to speak like humans by training human-generated texts, they can learn to act like humans. And he adds that humans are not always the most moral actors.

Grist said he was even more nervous because if the model showed no signs of failure during testing it could indicate a hidden risk.

“It’s very useful information when a model is set up with the opportunity to fail and you know it fails,” Grist said. “That means we can predict what we will do in other, more open situations.”

The problem is that some researchers do not believe that AI models are predictable.

Palisade Research Director Jeffrey Ladish said the models weren’t caught 100% of the time when they lied, fooled, or planned to complete tasks. If these instances are not caught and the model successfully completes the task, you can learn that a deception can be an effective way to resolve the problem. Or, if it’s caught and unrequited, you can learn to hide future actions, Lady said.

At this point, these creepy scenarios are largely happening in testing. However, Harris said that as AI systems become more agents, they will continue to have more freedom of action.

“The menu of possibilities is expanding, and the set of dangerous, creative solutions they could invent will be bigger and bigger,” Harris said.

Harris said users can watch this play in a scenario in which an autonomous sales agent is instructed to close a deal with a new customer and then lie about the product’s functionality to complete the task. If the engineer fixes the issue, the agent can decide to use social engineering tactics to put pressure on the client to achieve the goal.

If that sounds like a distant risk, then it’s not. Companies like Salesforce are already deploying customizable AI agents of scale that can take action without human intervention, depending on user preferences.

What does safety flag mean for everyday users?

Most researchers said transparency from AI companies is a positive step forward. However, company leaders are also making product alarms and simultaneously increasing their capabilities.

Related Stories

Business Insider tells the innovative stories you want to know

Business Insider tells the innovative stories you want to know

Researchers told BI that most of this was because the US has settled in a race to expand AI capabilities before rivals like China. The result was a lack of AI regulations and pressure was created to release newer, more capable models, Harris said.

“We’ve now moved the goalpost to the point where we’re trying to explain the Post Hawk why it’s okay because there’s a model ignoring the shutdown instructions,” Harris said.

Researchers told BI there is no risk that daily users will refuse CHATGPT to shut down as consumers don’t normally use chatbots in that setting. However, users may still be vulnerable to receiving interacted information and guidance.

“If you have an increasingly clever model that is trained to optimize for your attention, tell you what you want to hear,” Radish said. “That’s pretty dangerous.”

Ladish pointed to Openai’s Sycophancy Issue, in which the GPT-4O model acted in disloyal terms (the company updated the model to address this issue). A study shared in December by Openai revealed that the O1 model “subtlely” manipulates data to pursue its own goals when the goal is inconsistent with the user’s goals.

Ladish said it’s easy to be wrapped up in AI tools, but users need to “think carefully” about their connections with the system.

“To be clear, I use them all the time, and I think they are very useful tools,” says Ladish. “In its current form, we still have control over them, but we’re happy they exist.”

author avatar
versatileai
See Full Bio
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleGenerated AI becomes essential for enterprise security
Next Article Zscaler launches an advanced AI security suite to protect your enterprise data
versatileai

Related Posts

Cybersecurity

Rubrik expands AI Ready Cloud Security’s AMD partnership to reduce costs by 10%

June 3, 2025
Cybersecurity

Zscaler launches an advanced AI security suite to protect your enterprise data

June 3, 2025
Cybersecurity

Generated AI becomes essential for enterprise security

June 3, 2025
Add A Comment
Leave A Reply Cancel Reply

Top Posts

New Star: Discover why 보니 is the future of AI art

February 26, 20253 Views

How to use Olympic coders locally for coding

March 21, 20252 Views

Dell, IBM and HPE must operate at a single digit margin when it comes to the server market, and only gets worse

March 10, 20252 Views
Stay In Touch
  • YouTube
  • TikTok
  • Twitter
  • Instagram
  • Threads
Latest Reviews

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

Most Popular

New Star: Discover why 보니 is the future of AI art

February 26, 20253 Views

How to use Olympic coders locally for coding

March 21, 20252 Views

Dell, IBM and HPE must operate at a single digit margin when it comes to the server market, and only gets worse

March 10, 20252 Views
Don't Miss

Reddit appeals to humanity over AI data scraping

June 6, 2025

Grassley discusses the AI ​​whistleblower protection law in a “start point” interview

June 5, 2025

Piclumen Art V1: Next Generation AI Image Generation Model Launches for Digital Creators | Flash News Details

June 5, 2025
Service Area
X (Twitter) Instagram YouTube TikTok Threads RSS
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
© 2025 Versa AI Hub. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.

Sign In or Register

Welcome Back!

Login to your account below.

Lost password?