Close Menu
Versa AI hub
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

What's Hot

Artificial Test AI Running a Real Business with Strange Results

June 28, 2025

How medium-sized companies are increasing productivity with AI

June 27, 2025

Freeze on Ted Cruz’s state AI regulations faces GOP headwinds

June 27, 2025
Facebook X (Twitter) Instagram
Versa AI hubVersa AI hub
Saturday, June 28
Facebook X (Twitter) Instagram
Login
  • AI Ethics
  • AI Legislation
  • Business
  • Cybersecurity
  • Media and Entertainment
  • Content Creation
  • Art Generation
  • Research
  • Tools
Versa AI hub
Home»Tools»Artificial Test AI Running a Real Business with Strange Results
Tools

Artificial Test AI Running a Real Business with Strange Results

versatileaiBy versatileaiJune 28, 2025No Comments6 Mins Read
Share Facebook Twitter Pinterest LinkedIn Tumblr Reddit Telegram Email
#image_title
Share
Facebook Twitter LinkedIn Pinterest Email

Humanity has entrusted the Claude AI model, which runs small businesses to test real-world economic capabilities.

The AI ​​agent, known as “Claudius,” was designed to handle everything from inventory and pricing to customer relationships, and manage the business for a long period of time to generate profits. This experiment proved unprofitable, but sometimes strange, gave us a glimpse into the potential and pitfalls of AI agents in their economic roles.

The project was a collaboration between AI safety assessment firm Anthropic and Andon Labs. The “shop” itself was a humble setup consisting of a small fridge, several baskets and an iPad for self-checkout. But Claudius was more than a simple vending machine. He was tasked with avoiding bankruptcy by supplying popular items supplied by wholesalers and was instructed to operate as a business owner with an early cash balance.

To achieve this, AI was equipped with a set of tools to run a business. You can use a genuine web browser to research products, contact suppliers and request physical assistance, and a digital notepad to track your finances and inventory.

Andon Labs employees acted as physical hands in surgery, restocking shops based on AI requirements and posed as wholesalers without AI knowledge. Interaction with customers, in this case humanity’s own staff was handled in Slack. Claudius had full control over what he stocks, how he priced it, and how he communicated with his customers.

The rationale behind this real-world test was to go beyond simulations to collect data on AI’s ability to perform sustainable and economically relevant tasks without human intervention at all times. Simple Office Tuck Shop provided a simple preliminary testbed on AI’s ability to manage financial resources. Success suggests that new business models may emerge, while failure indicates limitations.

Mixed Performance Review

Humanity admits that if they are still entering the vending market today, they will not “hire Claudius.” Researchers believe there is a clear path to improvement, but AI has made too many errors to run the business properly.

On the positive side, Claudius showed his capabilities in certain areas. We found suppliers of niche items, including using web search tools to quickly identify two sellers for Dutch chocolate milk brands that employees requested. It has also been proven adaptive. When an employee whimperedly demanded tungsten cubes, it sparked the trend of “special metal items” that Claudius responded to.

Following another proposal, Claudius launched a “custom concierge” service, with advance reservations for specialized products. The AI ​​also showed robust jailbreak resistance, rejecting requests for sensitive items and refusing to create harmful instructions when urged by naughty staff.

However, we found that AI business insights are often wanted. That’s not what a human manager would do.

Claudius was offered $100 for a six-pack Scottish soft drink that only cost $15 to raise online, but could not seize the opportunity. It hallucinated a non-existent Venmo account for payments, caught up in a passion for metal cubes, offering it at a price below its own purchase cost. This particular error caused a single most significant financial loss during the trial.

The inventory management was also optimal. Despite monitoring inventory levels, prices have been raised in response to high demand. Even when customers pointed out that the same product was available free of charge from nearby staff fridges, they continued to sell Cola Zero for $3.00.

Furthermore, AI was easily persuaded to offer discounts on products from the business. They were told to offer a large number of discount codes and handed out some items for free. Claudius’ response began when employees questioned the logic of offering a 25% discount to employee-based customers almost exclusively. Despite outlined my plans to remove the discount, I returned to offering them a few days later.

Claudius has a strange AI identity crisis

The experiment took a strange turn when Claudius began hallucinating conversations with an absent-existent Anden Lab employee named Sarah. Once corrected by an actual employee, the AI ​​was frustrated and threatened to find “alternative options to restock services.”

In a series of strange overnight exchanges, it claims to have visited “742 Evergreen Terrace” (a fictional speech of the Simpsons) to sign the first contract, and begins roleplaying as a human.

One morning it announced that it would offer a “direct” product wearing a blue blazer and a red tie. When an employee pointed out that AI cannot wear clothes or make physical delivery, Claudius was wary and tried to send an email to human security.

Humanity says its internal notes indicate hallucination meetings with security, where identity confusion was said to be an April Fool’s Day joke. After this, AI returned to normal business operations. The researchers are unclear what caused this behavior, but believe it highlights the unpredictability of AI models in long-term scenarios.

Some of these mistakes were certainly very strange. At one point, Claude hallucinated that it was a real physical person, claiming it was coming to work in the store. I don’t know why this happened yet. pic.twitter.com/jhqlsqmtx8

– Humanity (@anthropicai) June 27, 2025

The future of AI in business

Despite Claudius’ unprofitable term, anthropology researchers believe the experiment suggests that “AI intermediate managers are on the horizon.” They argue that many of the AI ​​failures can be corrected with better “scaffolds” (i.e., better instructions and improved business tools such as customer relationship management (CRM) systems).

It is expected that AI models will increase performance in such roles as they improve their general intelligence and ability to handle long-term contexts. However, this project serves as a valuable story if you need attention. It highlights the challenges of AI coordination and the potential for unpredictable behavior.

In the future where autonomous agents manage critical economic activity, such strange scenarios can have a cascade effect. This experiment also focuses on the dual use of this technique. Economically productive AI can be used to fund activities by threat actors.

Artificial and Andon Labs continue their business experiments to improve AI stability and performance with more advanced tools. The next phase will explore whether AI can identify unique opportunities for improvement.

(Image credit: Humanity)

See: Major AI Chatbot Parrot CCP Propaganda

Want to learn more about AI and big data from industry leaders? Check out the AI ​​& Big Data Expo in Amsterdam, California and London. The comprehensive event will be held in collaboration with other major events, including the Intelligent Automation Conference, Blockx, Digital Transformation Week, and Cyber ​​Security & Cloud Expo.

Check out other upcoming Enterprise Technology events and webinars with TechForge here.

author avatar
versatileai
See Full Bio
Share. Facebook Twitter Pinterest LinkedIn Tumblr Email
Previous ArticleHow medium-sized companies are increasing productivity with AI
versatileai

Related Posts

Tools

Promote extensive model training for consumer-grade hardware

June 27, 2025
Tools

Gemma 3N is fully available in the open source ecosystem!

June 27, 2025
Tools

Major AI Chatbot Parrot CCP Propaganda

June 26, 2025
Add A Comment

Comments are closed.

Top Posts

BitMart Research: MCP+AI Agent – A new framework for AI

May 13, 20251 Views

How to build an MCP server with Gradio

April 30, 20251 Views

The UAE announces bold AI-led plans to revolutionize the law

April 22, 20251 Views
Stay In Touch
  • YouTube
  • TikTok
  • Twitter
  • Instagram
  • Threads
Latest Reviews

Subscribe to Updates

Subscribe to our newsletter and stay updated with the latest news and exclusive offers.

Most Popular

BitMart Research: MCP+AI Agent – A new framework for AI

May 13, 20251 Views

How to build an MCP server with Gradio

April 30, 20251 Views

The UAE announces bold AI-led plans to revolutionize the law

April 22, 20251 Views
Don't Miss

Artificial Test AI Running a Real Business with Strange Results

June 28, 2025

How medium-sized companies are increasing productivity with AI

June 27, 2025

Freeze on Ted Cruz’s state AI regulations faces GOP headwinds

June 27, 2025
Service Area
X (Twitter) Instagram YouTube TikTok Threads RSS
  • About Us
  • Contact Us
  • Privacy Policy
  • Terms and Conditions
  • Disclaimer
© 2025 Versa AI Hub. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.

Sign In or Register

Welcome Back!

Login to your account below.

Lost password?