Versa AI hub
New model design could solve high AI costs for enterprises

By versatileai | November 5, 2025 | 4 Mins Read

Business leaders grappling with the steep costs associated with deploying AI models may find a reprieve thanks to a new architectural design.

Although the capabilities of generative AI are attractive, the enormous computation required for both training and inference makes costs prohibitive and raises environmental concerns. At the heart of this inefficiency is a “fundamental bottleneck” in these models: an autoregressive process that generates text sequentially, one token at a time.

For companies processing massive data streams, from IoT networks to financial markets, this limitation makes producing long-form analysis slow and economically difficult. But a new research paper by Tencent AI and Tsinghua University suggests an alternative.

A new approach to AI efficiency

The study introduces the Continuous Autoregressive Language Model (CALM), which redesigns the generation process to predict continuous vectors instead of discrete tokens.

High-fidelity autoencoders “compress chunks of K tokens into a single continuous vector”, giving each generation step a much higher semantic bandwidth.

Instead of processing “the”, “cat”, and “sat” in three separate steps, the model compresses them into one. This design directly “reduces the number of generation steps”, attacking the computational load at its source.
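The chunking idea can be sketched with toy shapes. Everything here (the widths, the linear encoder/decoder, the variable names) is an illustrative assumption, not the paper's actual architecture; the point is only how grouping K tokens per step cuts the autoregressive step count:

```python
import numpy as np

# Toy sketch: an "encoder" maps a chunk of K token embeddings to a
# single continuous vector, a "decoder" maps it back, and the
# generator then predicts one vector per K tokens instead of one
# token per step. Shapes and linear maps are illustrative only.

K = 4           # tokens compressed per step (the paper's best config)
d_tok = 8       # assumed token-embedding width
d_lat = 16      # assumed latent-vector width
rng = np.random.default_rng(0)

W_enc = rng.normal(size=(K * d_tok, d_lat))   # assumed linear encoder
W_dec = rng.normal(size=(d_lat, K * d_tok))   # assumed linear decoder

def encode(chunk):            # (K, d_tok) -> (d_lat,)
    return chunk.reshape(-1) @ W_enc

def decode(z):                # (d_lat,) -> (K, d_tok)
    return (z @ W_dec).reshape(K, d_tok)

# A 128-token sequence needs 128 autoregressive steps token-by-token,
# but only 128 / K = 32 steps when each step emits one latent vector.
seq_len = 128
steps_discrete = seq_len
steps_calm = seq_len // K
print(steps_discrete, steps_calm)   # 128 32
```

The step-count arithmetic, not the toy linear maps, is what drives the cost savings discussed below.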

Experimental results show a better performance–compute trade-off. A CALM model that grouped four tokens per step delivered performance “comparable to a strong discrete baseline, but at significantly lower computational cost.”

For example, one CALM model required 44 percent fewer training FLOPs and 34 percent fewer inference FLOPs than a baseline Transformer of comparable performance. This represents savings in both the initial capital expenditure of training and the recurring operating costs of inference.

Rebuilding the toolkit for the continuous domain

Moving from a finite, discrete vocabulary to an infinite, continuous vector space breaks the standard LLM toolkit. The researchers needed to develop a “comprehensive likelihood-free framework” to make their new model workable.

For training, the model cannot use a standard softmax layer or maximum likelihood estimation. To solve this, the team adopted a likelihood-free objective built on an Energy Transformer, which rewards the model for accurate predictions without ever computing explicit probabilities.
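One way to reward accurate predictions using only model samples is a strictly proper scoring rule such as the energy score. The sketch below is a minimal illustration of that idea; the paper's exact loss and architecture may differ, and the target, sample shapes, and distributions here are invented for the demo:

```python
import numpy as np

# Likelihood-free, energy-score style objective: the model is scored
# using only *samples* it draws, never an explicit probability. For a
# target vector y and model samples x_1..x_n, the energy score is
#     S = mean_i ||x_i - y||  -  0.5 * mean_{i != j} ||x_i - x_j||.
# It is a strictly proper scoring rule: in expectation it is
# minimized when the samples come from the true distribution of y.

def energy_score(samples, y):
    """samples: (n, d) model draws; y: (d,) target vector."""
    n = len(samples)
    attract = np.mean(np.linalg.norm(samples - y, axis=1))
    diffs = samples[:, None, :] - samples[None, :, :]
    pair = np.linalg.norm(diffs, axis=-1)          # (n, n) distances
    repel = pair.sum() / (n * (n - 1))             # off-diagonal mean
    return attract - 0.5 * repel

rng = np.random.default_rng(1)
y = np.zeros(16)
good = rng.normal(0.0, 1.0, size=(64, 16))   # samples near the target
bad = rng.normal(5.0, 1.0, size=(64, 16))    # samples far from it
print(energy_score(good, y) < energy_score(bad, y))  # True
```

Because the score is computed purely from draws and distances, it sidesteps the softmax and the explicit likelihood entirely.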

This new training method also demanded new evaluation metrics. Standard benchmarks such as perplexity do not apply, because they rely on the very likelihoods the model no longer computes.

The team proposed BrierLM, a new metric based on Brier scores that can be estimated purely from model samples. Validation confirmed that BrierLM is a reliable alternative, showing a Spearman rank correlation of -0.991 with traditional loss metrics.
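To see how a Brier score can be estimated purely from samples, consider the categorical case. The sketch below shows one standard unbiased two-sample estimator; BrierLM's exact construction may differ, and the three-outcome distribution here is invented for the demo:

```python
import numpy as np

# Sample-only Brier estimation: for an outcome y with hidden model
# probabilities p, the Brier score
#     sum_i (p_i - 1[i = y])^2  =  1 - 2*p_y + sum_i p_i^2
# can be estimated without ever computing p, using two independent
# model samples x1, x2, since
#     E[1[x1 = y]] = p_y     and     E[1[x1 = x2]] = sum_i p_i^2.

def brier_from_samples(x1, x2, y):
    return 1.0 - 2.0 * (x1 == y) + (x1 == x2)

rng = np.random.default_rng(2)
p = np.array([0.7, 0.2, 0.1])             # hidden "model" distribution
y = 0                                     # observed token id
draws1 = rng.choice(3, size=200_000, p=p)
draws2 = rng.choice(3, size=200_000, p=p)
estimate = brier_from_samples(draws1, draws2, y).mean()
exact = 1 - 2 * p[y] + np.sum(p ** 2)     # 1 - 1.4 + 0.54 = 0.14
print(estimate, exact)
```

Averaging the estimator over many sample pairs converges to the exact score, which is what makes a purely sample-based metric feasible.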

Finally, the framework restores controlled generation, a key requirement for enterprise applications. Standard temperature sampling is impossible without a probability distribution, so the paper introduces a new likelihood-free sampling algorithm, including a practical batch approximation, to manage the trade-off between output accuracy and diversity.
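A simple rejection scheme illustrates how temperature-like sharpening is possible with samples alone. The paper's batch-approximation algorithm is more elaborate; the version below, with an invented toy distribution, only demonstrates the core trick for integer inverse temperatures:

```python
import numpy as np

# Likelihood-free "temperature" sampling: for an integer inverse
# temperature n = 1/T, draw n independent samples and accept only
# when they all agree. Accepted draws follow the sharpened
# distribution  p_i^n / sum_j p_j^n  -- no probabilities needed.

def sharpened_sample(draw, n, max_tries=10_000):
    """draw(): one model sample; returns a sample ~ p^n (normalized)."""
    for _ in range(max_tries):
        xs = [draw() for _ in range(n)]
        if all(x == xs[0] for x in xs):    # unanimous batch -> accept
            return xs[0]
    return xs[0]                           # fallback after max_tries

rng = np.random.default_rng(3)
p = np.array([0.6, 0.3, 0.1])             # hidden "model" distribution
draw = lambda: rng.choice(3, p=p)

samples = [sharpened_sample(draw, n=2) for _ in range(5_000)]
counts = np.bincount(samples, minlength=3) / len(samples)
# Target at T = 1/2: p^2 / Z = [0.36, 0.09, 0.01] / 0.46 ≈ [0.78, 0.20, 0.02]
print(counts.round(2))
```

Lower temperatures (larger n) need more rejected batches, which is why a practical batch approximation matters for the accuracy–diversity trade-off.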

Reducing AI costs for enterprises

This research offers a glimpse into a future where generative AI is not defined purely by an ever-increasing number of parameters, but by the efficiency of its architecture.

Current paths to scaling models have hit a wall of diminishing returns and increasing costs. The CALM framework establishes “a new design axis for LLM scaling: increasing the semantic bandwidth of each generation step.”

Although it is a research framework and not an off-the-shelf product, it represents a powerful and scalable path towards ultra-efficient language models. When evaluating a vendor’s roadmap, technology leaders should look beyond model size and start thinking about architectural efficiency.

The ability to reduce FLOPs per generated token is a decisive competitive advantage, enabling more economical and sustainable AI deployment across the enterprise, from the data center to data-intensive edge applications.

See also: Flawed AI benchmarks put corporate budgets at risk

Want to learn more about AI and big data from industry leaders? Check out the AI & Big Data Expos in Amsterdam, California, and London. This comprehensive event is part of TechEx and co-located with other major technology events such as Cyber Security Expo. Click here for more information.

AI News is brought to you by TechForge Media. Learn about other upcoming enterprise technology events and webinars.

© 2026 Versa AI Hub. All Rights Reserved.