New model design could solve high AI costs for enterprises

By versatileai | November 5, 2025

Business leaders grappling with the steep costs associated with deploying AI models may find a reprieve thanks to a new architectural design.

Although the capabilities of generative AI are attractive, the enormous amount of computation required for both training and inference creates prohibitive costs and environmental concerns. At the heart of this inefficiency is a “fundamental bottleneck” in today’s models: an autoregressive process that generates text sequentially, one token at a time.

For companies processing massive data streams, from IoT networks to financial markets, this limitation makes producing long-form analysis slow and economically difficult. But a new research paper by Tencent AI and Tsinghua University suggests an alternative.

A new approach to AI efficiency

The study introduces the Continuous Autoregressive Language Model (CALM), which redesigns the generation process to predict continuous vectors instead of discrete tokens.

A high-fidelity autoencoder “compresses chunks of K tokens into a single continuous vector”, preserving a higher semantic bandwidth per generation step.

Instead of processing “the”, “cat” and “sat” in three separate steps, the model compresses them into a single step. This design directly “reduces the number of generation steps” and, with it, the computational load.
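To make the idea concrete, here is a minimal, illustrative PyTorch sketch of how such a design could be wired up. The module names, sizes, and the K = 4 grouping are assumptions for illustration, not the paper’s actual architecture or code:

```python
import torch
import torch.nn as nn

# Illustrative sketch of the CALM idea: compress a chunk of K tokens into one
# continuous vector, then generate one vector (rather than one token) per step.
K, VOCAB, D_TOK, D_VEC = 4, 32000, 256, 512   # hypothetical sizes

class ChunkAutoencoder(nn.Module):
    """Compresses K token embeddings into a single vector and reconstructs
    the K tokens from it (the 'high-fidelity autoencoder' role)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, D_TOK)
        self.encode = nn.Linear(K * D_TOK, D_VEC)      # K tokens -> 1 vector
        self.decode = nn.Linear(D_VEC, K * VOCAB)      # 1 vector -> K token logits

    def forward(self, token_chunk):                    # token_chunk: (batch, K) ids
        x = self.embed(token_chunk).flatten(1)         # (batch, K * D_TOK)
        z = self.encode(x)                             # (batch, D_VEC)
        logits = self.decode(z).view(-1, K, VOCAB)     # reconstruct the K tokens
        return z, logits

# The language model itself then predicts the *next vector* autoregressively,
# so a 12-token continuation takes 3 generation steps instead of 12.
backbone = nn.GRU(D_VEC, D_VEC, batch_first=True)      # stand-in for a Transformer
```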

Experimental results show a better performance–compute trade-off. A CALM model that grouped four tokens delivered performance “comparable to a strong discrete baseline, but at significantly lower computational cost.”

For example, one CALM model required 44 percent fewer training FLOPs and 34 percent fewer inference FLOPs than a baseline Transformer of comparable performance. This represents savings in both the initial capital expenditure of training and the recurring operating costs of inference.
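As a rough worked example of how those two percentages combine, consider the sketch below; the absolute FLOP budgets are assumed purely for illustration, and only the 44% and 34% reductions come from the reported results:

```python
# Hypothetical FLOP budgets, chosen only to show how the reported savings combine.
baseline_train_flops = 1.0e21     # assumed one-off training budget
baseline_infer_flops = 5.0e20     # assumed lifetime inference budget

calm_train_flops = baseline_train_flops * (1 - 0.44)  # 44% fewer training FLOPs
calm_infer_flops = baseline_infer_flops * (1 - 0.34)  # 34% fewer inference FLOPs

saving = 1 - (calm_train_flops + calm_infer_flops) / (baseline_train_flops + baseline_infer_flops)
print(f"Overall compute reduction: {saving:.0%}")      # ~41% under this assumed split
```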

Rebuilding the toolkit for the continuous domain

Moving from a finite, discrete vocabulary to an infinite, continuous vector space breaks the standard LLM toolkit. The researchers needed to develop a “comprehensive likelihood-free framework” to make their new model workable.

For training, the model cannot use standard softmax layers or maximum likelihood estimation. To solve this, the team adopted a likelihood-free objective built around an Energy Transformer, which rewards the model for accurate predictions without calculating explicit probabilities.
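The paper’s exact objective is built around its Energy Transformer head, but the underlying principle can be illustrated with a generic sample-based energy score, a strictly proper scoring rule that can be minimised using only samples from the model. The sketch below is an assumption-laden illustration of that principle, not the authors’ implementation:

```python
import torch

def energy_score_loss(pred_samples: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Likelihood-free training signal based on the energy score.

    pred_samples: (batch, 2, d) -- two independent samples drawn from the
                  model's generative head for each position (illustrative).
    target:       (batch, d)    -- the ground-truth continuous vector.

    The loss rewards samples that land close to the target, while the spread
    term discourages collapse to a single point; no explicit probability
    density is ever computed.
    """
    x1, x2 = pred_samples[:, 0], pred_samples[:, 1]
    fidelity = 0.5 * (torch.norm(x1 - target, dim=-1) + torch.norm(x2 - target, dim=-1))
    spread = torch.norm(x1 - x2, dim=-1)
    return (fidelity - 0.5 * spread).mean()
```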

This new training method also required new evaluation metrics. Standard benchmarks such as perplexity do not apply, because they rely on the very likelihoods the model no longer computes.

The team proposed BrierLM, a new metric based on Brier scores that can be estimated purely from model samples. Validation confirmed that BrierLM is a reliable alternative, showing a Spearman rank correlation of -0.991 with traditional loss metrics.
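BrierLM’s precise definition is given in the paper; the sketch below only illustrates the property the researchers rely on, namely that a Brier-style score can be estimated from samples alone, without access to the model’s probabilities. The two-sample estimator and interface here are illustrative assumptions:

```python
def brier_estimate(sample_pairs, references):
    """Monte-Carlo estimate of a Brier-style score using only model samples.

    sample_pairs: list of (s1, s2) pairs -- two independent samples from the
                  model for the same context (e.g. a decoded next chunk).
    references:   list of the corresponding ground-truth outputs.

    Uses the identity Brier = sum_k p_k^2 - 2*p_y + 1, whose terms have
    sample-only unbiased estimators: E[1(s1 == s2)] = sum_k p_k^2 and
    E[1(s == y)] = p_y. Lower is better for the raw Brier score.
    """
    total = 0.0
    for (s1, s2), y in zip(sample_pairs, references):
        total += float(s1 == s2) - float(s1 == y) - float(s2 == y) + 1.0
    return total / len(references)

# Toy usage: a model that almost always reproduces the reference scores near 0.
pairs = [("the cat sat", "the cat sat"), ("the cat sat", "the dog sat")]
refs = ["the cat sat", "the cat sat"]
print(brier_estimate(pairs, refs))
```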

Finally, the framework restores controlled generation, a key feature for enterprise applications. Standard temperature sampling is not possible without a probability distribution, so the paper introduces a new likelihood-free sampling algorithm, including a practical batch approximation method, to manage the trade-off between output accuracy and diversity.
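The paper’s algorithm and its batch approximation are more involved, but a simple sample-only way to trade diversity for accuracy can be sketched as follows; the `draw_sample` interface and the mode-of-a-batch heuristic are assumptions for illustration, not the authors’ method:

```python
import random
from collections import Counter

def batch_sharpened_sample(draw_sample, batch_size: int = 8):
    """Sample-only knob for trading diversity against accuracy.

    draw_sample: zero-argument callable returning one sample from the model
                 (hypothetical interface). batch_size=1 is ordinary sampling
                 (maximum diversity); larger batches return the most frequent
                 draw, approximating a lower 'temperature' without ever
                 touching a probability distribution.
    """
    batch = [draw_sample() for _ in range(batch_size)]
    most_common, _count = Counter(batch).most_common(1)[0]
    return most_common

# Toy sampler standing in for the model (implicit probabilities 0.6 / 0.2 / 0.2):
toy = lambda: random.choice(["A", "A", "A", "B", "C"])
print(batch_sharpened_sample(toy, batch_size=16))   # almost always "A"
```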

Reducing AI costs for enterprises

This research offers a glimpse into a future where generative AI is not defined purely by an ever-increasing number of parameters, but by the efficiency of its architecture.

Current paths to scaling models have hit a wall of diminishing returns and increasing costs. The CALM framework establishes “a new design axis for LLM scaling: increasing the semantic bandwidth of each generation step.”

Although it is a research framework and not an off-the-shelf product, it represents a powerful and scalable path towards ultra-efficient language models. When evaluating a vendor’s roadmap, technology leaders should look beyond model size and start thinking about architectural efficiency.

The ability to reduce FLOPs per generated token is a decisive competitive advantage, enabling more economical and sustainable AI deployments across the enterprise, from the data center to data-intensive edge applications.

See also: Flawed AI benchmarks put corporate budgets at risk

