Workplace learning for AI agents

By versatileai · April 8, 2026 · 6 min read

Most AI agents reread transcripts instead of learning principles, so they repeat the same mistakes and fail to transfer lessons to new situations. ALTK‑Evolve converts raw agent trajectories into reusable guidelines. In our benchmarks, this approach improved reliability without bloating the context, especially on hard multi-step tasks (Δ +14.2% on AppWorld).

The “Eternal Intern” Problem

Imagine a great cook who knows every cookbook by heart but forgets the kitchen every morning. They don’t remember that your oven runs hot or that patrons like extra salt. They follow the recipe card, but when they run out of lemons, they freeze. This is most AI agents: good at following prompts, bad at accumulating knowledge about their environment. Pasting yesterday’s log back into the prompt just makes them reread the history; it doesn’t help them generalize from it.

A junior cook needs separate recipes for vinaigrette and duck à l’orange; a chef learns that acid balances fat and applies it everywhere. Similarly, a reliable agent must extract principles from experience and apply them to new tasks, not just near-replicas of old ones. ALTK‑Evolve’s long-term memory subsystem does exactly that: it converts interaction traces into candidate guidelines, filters them for quality, and injects only the guidance relevant at the moment of action. Agents need principles, not records.

A recent MIT study found that 95% of AI pilots fail, in part because agents don’t adapt and learn in the field. ALTK‑Evolve addresses this learning gap with long-term episodic memory, letting agents reason better over time.

The Solution: Long-Term Memory with ALTK‑Evolve

ALTK‑Evolve is a memory system for AI agents that helps them improve over time by learning from previous runs and applying the guidelines generated from them.

Operationally, the system runs as a continuous loop.

  • Downward flow (observation and extraction): capture the complete agent trajectory (user utterances, thoughts, tool calls, and outcomes) at the interaction layer, such as Langfuse or another OpenTelemetry-based observability tool. Pluggable extractors mine the traces for structural patterns and persist them as candidate entities.
  • Upward flow (refinement and retrieval): background consolidation and scoring jobs merge duplicates, prune weak rules, and reinforce proven strategies, evolving a high-quality library of entities such as guidelines, policies, and SOPs. Retrieval then surfaces only the relevant items through the interaction layer and returns them to the application-layer context.
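The two flows above can be sketched as a minimal loop. Everything here is illustrative: the function names, the error-based extraction rule, and the evidence-count scoring are stand-ins for intuition, not the actual ALTK‑Evolve API.

```python
# Minimal sketch of the downward/upward memory loop (illustrative only).
from collections import defaultdict

def extract_candidates(trace):
    """Downward flow: mine a trajectory for candidate guidelines.
    Here we simply turn each failed tool call into a cautionary rule."""
    return [f"Check inputs before calling {step['tool']}"
            for step in trace if step["outcome"] == "error"]

def consolidate(store, candidates):
    """Upward flow: merge duplicates and strengthen repeated rules."""
    for rule in candidates:
        store[rule] += 1          # score = evidence count
    return {r: s for r, s in store.items() if s >= 1}  # prune weak rules

def retrieve(store, k=5):
    """Just-in-time retrieval: surface only the top-k scored guidelines."""
    return [r for r, _ in sorted(store.items(), key=lambda kv: -kv[1])[:k]]

store = defaultdict(int)
trace = [
    {"tool": "send_email", "outcome": "error"},
    {"tool": "send_email", "outcome": "error"},
    {"tool": "search", "outcome": "ok"},
]
store = defaultdict(int, consolidate(store, extract_candidates(trace)))
print(retrieve(store))  # → ['Check inputs before calling send_email']
```

The key property the sketch preserves is that only compact, scored rules reach the agent's context, never the raw trajectory.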

This approach works for several main reasons.

  • Teaches judgment: one-time events become portable strategies that transfer between tasks.
  • Controls noise: scoring keeps memory lean and useful and prevents it from growing into a junk drawer.
  • Progressive disclosure: retrieval is just-in-time, so the context is never crammed with everything at once.

The result: increased reliability, especially in difficult tasks.

We evaluated on the AppWorld benchmark, where agents complete realistic multi-step tasks through APIs (on average 9.5 API calls across 1.8 apps per task), including hard cases that require complex control flow. A ReAct agent received the task instructions plus the top five retrieved guidelines generated in a previous run (train/dev) and was tested on an unseen partition (test-normal). We report Scenario Goal Completion (SGC), a strict consistency metric that requires success across scenario variants.

Difficulty   Baseline SGC   + Memory   Δ
Easy         79.0%          84.2%      +5.2
Medium       56.2%          62.5%      +6.3
Hard         19.1%          33.3%      +14.2
Overall      50.0%          58.9%      +8.9

Key conclusions from the evaluation include:

  • Generalization: the agent improves on unseen test-normal tasks, indicating it is learning principles rather than memorizing recipes.
  • Complexity scaling: the harder the task, the more the agent benefits from learned guidelines; hard tasks saw a 74% relative increase in success rate, because guidelines help navigate complex control flows.
  • Consistency: the SGC improvement exceeded the raw pass-rate improvement, reducing erratic behavior across scenario variations. Guidelines not only help agents solve tasks; they also help ensure tasks are solved across variants.
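The 74% relative figure follows directly from the hard-task row of the table above:

```python
# Relative improvement on hard tasks, computed from the table above
baseline, with_memory = 19.1, 33.3
relative_gain = (with_memory - baseline) / baseline
print(f"{relative_gain:.0%}")  # → 74%
```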

For more details on the experiment, please see the paper at https://arxiv.org/abs/2603.10600.

Getting Started (Choose Your Path)

You can choose how ALTK‑Evolve is integrated into your agent.

No-Code (Lite Mode) with Claude Code, Codex, or IBM Bob

Install the plugin in Claude Code:

  /plugin marketplace add AgentToolkit/altk-evolve
  /plugin install evolve-lite@evolve-marketplace

That’s it! The plugin extracts entities from the trajectory and saves them as files on the file system, and Claude Code’s hooks handle automatic retrieval.

Prefer watching to reading? Watch a short tutorial (video) on Evolve-Lite with Claude Code: Demo

For a worked example of learning with Claude Code in Lite mode, check out the walkthrough here.

Lite mode is easy to try, but it has limitations: it doesn’t collect insights from the entire agent session, and it performs no entity consolidation or garbage collection. The low-code and pro-code paths below address these gaps.

There are also one-step integrations with Codex and IBM Bob. Please try them!

Low-Code with a ReAct Agent

Add a single altk_evolve.auto import, flip a flag, and traces flow to the Arize Phoenix UI. Then sync the traces to generate improvement guidelines, all without changing your current stack. It works with popular LLM clients and agent frameworks (OpenAI, LiteLLM, Hugging Face agents, etc.), so you keep your stack and gain easy visibility.
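As a rough configuration sketch, the setup could look like the fragment below. The altk_evolve.auto module name comes from the article; the environment-variable flag is a guess, and the real configuration may differ, so treat this as a shape, not a recipe.

```python
# Hypothetical low-code setup sketch (flag name is an assumption).
import os
os.environ["ALTK_EVOLVE_ENABLED"] = "1"   # hypothetical enable flag

import altk_evolve.auto  # auto-instruments supported LLM clients (per the article)

# ...your existing ReAct agent code runs unchanged; traces flow to
# Arize Phoenix, and syncing them generates improvement guidelines.
```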

To see how easily this fits into your existing projects, explore our hands-on examples that showcase the integration of different frameworks. See the Low-Code Tracing documentation for configuration and feature details.

Pro-Code with CUGA

We integrated ALTK‑Evolve directly into CUGA via MCP to create a tight, low-overhead learning loop. The get_guidelines MCP tool is called before each run to surface task-specific steering and reduce trial and error. After execution, CUGA sends back a structured execution trace via save_trajectory so Evolve can learn from what actually happened and improve future guidance. The result is an agent that improves over time while maintaining transparency, configurability, and ease of deployment.
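The loop above (fetch guidance, execute, report the trace) can be sketched with a stub client. The tool names get_guidelines and save_trajectory come from the article; the client class and the trace shape are stand-ins, not the real MCP SDK or CUGA internals.

```python
# Sketch of the MCP learning loop; StubMCPClient stands in for a real MCP client.
class StubMCPClient:
    def __init__(self):
        self.guidelines = ["Verify API auth before the first call"]
        self.saved = []

    def call_tool(self, name, **kwargs):
        if name == "get_guidelines":
            return self.guidelines
        if name == "save_trajectory":
            self.saved.append(kwargs["trace"])
            return "ok"

def run_task(client, task):
    # 1. Fetch task-specific steering before execution
    guidance = client.call_tool("get_guidelines", task=task)
    # 2. Execute the task (stubbed here) using the guidance
    trace = {"task": task, "guidance": guidance, "steps": ["..."]}
    # 3. Report the structured trace back so Evolve can learn from it
    client.call_tool("save_trajectory", trace=trace)
    return trace

client = StubMCPClient()
result = run_task(client, "book a flight")
print(len(client.saved))  # → 1
```

The design point is that learning happens out-of-band: the agent only ever sees the retrieved guidelines, and the memory system only ever sees structured traces.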

Would you like a visual tour? Check out our CUGA integration tutorial: Video

Try It, and Let Us Know What Your Agents Learn

Agents shouldn’t wake up every morning as interns. This approach helps them learn on the job. If you use Claude Code, Codex, or IBM Bob, you can try it in a few minutes and see how it improves your agents.

Starring the repository helps others discover the project and gives us direct guidance on what to build next.

Watch the demos:

  • Claude Code walkthrough (video): Demo
  • OpenAI Codex walkthrough (video): Demo
  • IBM Bob walkthrough (video): Demo
  • CUGA integration walkthrough (video): Video
