Achieve density and score across distributions with one transformer

By versatileai, June 30, 2026 (5 min read)
📄 Technical report: arxiv.org/abs/2511.05924

Many problems in machine learning and science boil down to the same task: we have a collection of data points and want to recover the distribution they came from, that is, which values are common and which are rare. Characterizing that distribution means estimating two quantities. One is the density of the distribution; the other is its score, which becomes more useful as dimensionality increases. The density is a smoothed version of a histogram: high where points cluster together and low where they are sparse. The score (the gradient of the log density) points in the direction in which the density increases fastest. Moving a point along the score moves it toward more likely regions.
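To make the two quantities concrete, here is the definition in symbols, followed by the one case where both are easy to write down. The notation (p for the density, s for the score, an isotropic Gaussian with mean \mu and standard deviation \sigma in d dimensions) is chosen here for illustration and is not taken from the report.

```latex
s(x) \;=\; \nabla_x \log p(x) \;=\; \frac{\nabla_x p(x)}{p(x)}

% Example: isotropic Gaussian in d dimensions
p(x) \;=\; (2\pi\sigma^2)^{-d/2} \exp\!\left(-\frac{\lVert x-\mu\rVert^2}{2\sigma^2}\right),
\qquad
s(x) \;=\; -\frac{x-\mu}{\sigma^2}
```

In the Gaussian case the score always points from x back toward the mean, which is the "move toward more likely regions" behavior described above.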

Diffusion-based generative models (the technology behind AI image generators like Stable Diffusion and DALL-E) start from random noise and turn it into realistic images by repeatedly following a score. The same score drives Bayesian sampling and the particle simulations used to model systems such as plasmas.
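As a rough illustration of "following a score", the sketch below runs unadjusted Langevin dynamics with the exact score of a Gaussian target. It is a simplification: real diffusion models use learned, noise-dependent scores and a noise schedule, and the function names, step size, and target parameters here are assumptions made for the example.

```python
import numpy as np

def gaussian_score(x, mean, std):
    """Exact score of an isotropic Gaussian: grad log p(x) = -(x - mean) / std**2."""
    return -(x - mean) / std**2

def langevin_sample(score_fn, x0, step=0.01, n_steps=2000, seed=0):
    """Unadjusted Langevin dynamics: drift along the score plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step * score_fn(x) + np.sqrt(2.0 * step) * noise
    return x

# Hypothetical target: 2-D Gaussian centered at (3, -1) with std 0.5.
mean, std = np.array([3.0, -1.0]), 0.5
# Start from broad random noise, far from the target.
start = np.random.default_rng(1).standard_normal((5000, 2)) * 4.0
samples = langevin_sample(lambda x: gaussian_score(x, mean, std), start)
print(samples.mean(axis=0), samples.std(axis=0))
```

With a small step size the particles end up distributed approximately like the target, using nothing but its score.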

Extracting density and score from a finite sample is difficult, and today's tools force a trade-off between generality and accuracy. One classic approach, kernel density estimation (KDE), computes the density at any location from the data points around it: the closer and more numerous the nearby points, the higher the density. It requires no training and can be applied to any distribution, but its accuracy degrades rapidly as dimensionality increases. Alternatively, neural score matching models, which are trained to predict scores, remain accurate even in high dimensions, but each model learns a single distribution, so every new distribution requires retraining from scratch.
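For reference, a Gaussian KDE and the score it implies fit in a few lines of NumPy. This is a minimal sketch with a single hand-picked bandwidth, not the tuned baseline from the report; the function name and the example data are assumptions.

```python
import numpy as np

def kde_density_and_score(queries, data, bandwidth):
    """Gaussian KDE density and its score (grad log density) at each query.

    queries: (m, d) evaluation points, data: (n, d) samples.
    """
    n, d = data.shape
    diff = data[None, :, :] - queries[:, None, :]        # (m, n, d): x_i - q
    sq_dist = np.sum(diff**2, axis=-1)                   # (m, n)
    log_kernel = -0.5 * sq_dist / bandwidth**2
    # Log-sum-exp trick keeps the computation stable when kernels are tiny.
    max_log = log_kernel.max(axis=1, keepdims=True)
    weights = np.exp(log_kernel - max_log)               # (m, n)
    log_norm = -0.5 * d * np.log(2 * np.pi * bandwidth**2) - np.log(n)
    log_density = max_log[:, 0] + np.log(weights.sum(axis=1)) + log_norm
    # Normalized kernel weights give the score as a weighted mean of (x_i - q).
    weights = weights / weights.sum(axis=1, keepdims=True)
    score = np.einsum("mn,mnd->md", weights, diff) / bandwidth**2
    return np.exp(log_density), score

rng = np.random.default_rng(0)
data = rng.standard_normal((500, 3))       # samples from a 3-D standard normal
queries = rng.standard_normal((4, 3))      # arbitrary evaluation points
density, score = kde_density_and_score(queries, data, bandwidth=0.5)
print(density.shape, score.shape)
```

The pairwise array has size m x n x d, which is one reason KDE becomes memory-hungry as the sample grows.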

We introduce a new solution called DiScoFormer (Density and Score Transformer): a single model that, given a set of data points, estimates both the density and the score of the underlying distribution in one forward pass, without retraining.

Training a transformer for density and score estimation

DiScoFormer uses stacked transformer blocks to map an entire sample to the density and score of the underlying distribution. The model uses cross-attention, so density and score can be evaluated at any point, not just where the data lies. Score and density are mathematically linked: the score is the gradient of the logarithm of the density. We exploit this by using a shared backbone with two output heads, one for density and one for score.

This coupling does more than save parameters. Because the score head must match the gradient of the log-density head at every query, any gap between the two yields a label-free consistency signal. We use it at inference time: holding the context fixed, we take a few gradient steps on this consistency loss. DiScoFormer thereby adapts to out-of-distribution inputs on the fly, without needing ground-truth densities or scores.
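The consistency idea can be sketched without any model at all. Below, two stand-in functions play the role of the log-density head and the score head, and the loss measures the squared gap between the score and a finite-difference gradient of the log density. The head functions, the finite-difference gradient, and all names are assumptions for illustration; a real implementation would differentiate through the network with automatic differentiation and would then take gradient steps on this loss.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-5):
    """Central-difference gradient of a scalar function f at each row of x (m, d)."""
    grad = np.zeros_like(x)
    for k in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[k] = eps
        grad[:, k] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

def consistency_loss(log_density_head, score_head, queries):
    """Mean squared gap between the score head and grad of the log-density head."""
    gap = score_head(queries) - numerical_grad(log_density_head, queries)
    return np.mean(np.sum(gap**2, axis=1))

# Stand-in "heads" for a standard Gaussian (hypothetical, not the model's heads).
def log_density_head(x):
    d = x.shape[1]
    return -0.5 * np.sum(x**2, axis=1) - 0.5 * d * np.log(2 * np.pi)

def consistent_score_head(x):
    return -x            # exactly grad log p for a standard Gaussian

def biased_score_head(x):
    return -0.8 * x      # deliberately inconsistent with the density head

queries = np.random.default_rng(0).standard_normal((64, 2))
loss_ok = consistency_loss(log_density_head, consistent_score_head, queries)
loss_bad = consistency_loss(log_density_head, biased_score_head, queries)
print(loss_ok, loss_bad)
```

No ground-truth labels appear anywhere in the loss: it only compares the two outputs with each other, which is what makes it usable at inference time.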

There is a mathematical reason the transformer architecture suits this task. Kernel density estimation has a single bandwidth: how far the influence of each point reaches is fixed in advance and applied equally everywhere. Attention is a strict generalization of that. We show analytically that the weights of a single attention head approximate Gaussian kernels over the data, so one cross-attention block can already reproduce the density and score of KDE. From there the model goes further, learning several such scales at once and adapting them to the data. DiScoFormer is therefore not a black box that abandons the classical approach: it contains KDE as a special case and improves on it.
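One way to see the link between attention and a Gaussian kernel is the identity sketched below. If the keys and values are the data points, the logits are scaled dot products, and each key carries a bias of minus half its squared norm, then the softmax weights equal normalized Gaussian kernel weights, and the attention output recovers the KDE score. This is an illustrative construction under those assumptions (including the per-key bias and the fixed bandwidth), not necessarily the one used in the report.

```python
import numpy as np

def softmax(logits, axis=-1):
    shifted = logits - logits.max(axis=axis, keepdims=True)
    w = np.exp(shifted)
    return w / w.sum(axis=axis, keepdims=True)

def attention_kde_score(queries, data, bandwidth):
    """Cross-attention with keys = values = data, arranged to mimic Gaussian KDE."""
    logits = queries @ data.T / bandwidth**2                        # dot-product term
    logits = logits - 0.5 * np.sum(data**2, axis=1) / bandwidth**2  # per-key bias
    weights = softmax(logits, axis=1)                               # (m, n)
    attended = weights @ data                                       # weighted mean of data
    score = (attended - queries) / bandwidth**2
    return score, weights

def gaussian_kernel_weights(queries, data, bandwidth):
    """Normalized Gaussian kernel weights computed directly from distances."""
    sq = np.sum((queries[:, None, :] - data[None, :, :])**2, axis=-1)
    k = np.exp(-0.5 * sq / bandwidth**2)
    return k / k.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
data = rng.standard_normal((200, 2))
queries = rng.standard_normal((5, 2))
score, attn_weights = attention_kde_score(queries, data, bandwidth=0.5)
kernel_weights = gaussian_kernel_weights(queries, data, bandwidth=0.5)
print(np.abs(attn_weights - kernel_weights).max())
```

The squared norm of the query is the same for every key, so it cancels inside the softmax; that is why a dot product plus a key bias is enough to reproduce a distance-based kernel.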

What data did we use to train DiScoFormer? We relied on Gaussian mixture models (GMMs) for two main reasons. First, a GMM is a universal density approximator: with enough components it can match essentially any smooth distribution to arbitrarily small error. Second, a GMM has a closed-form density and score, so there is always an exact target to supervise against. We exploit both properties by drawing new GMMs for every batch, which gives the model a virtually unlimited supply of target distributions, each supervised with the exact density and score of its GMM.
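The closed-form targets are straightforward to compute. The sketch below draws a random GMM, samples from it, and evaluates the exact density and score at those samples. The isotropic components, the parameter ranges, and the function names are assumptions made for brevity; the report's actual sampling scheme for GMMs may differ.

```python
import numpy as np

def sample_random_gmm(rng, n_components, dim):
    """Draw random mixture weights, means, and isotropic standard deviations."""
    weights = rng.dirichlet(np.ones(n_components))
    means = rng.normal(scale=3.0, size=(n_components, dim))
    stds = rng.uniform(0.5, 1.5, size=n_components)
    return weights, means, stds

def gmm_sample(rng, weights, means, stds, n):
    """Pick a component per sample, then add scaled Gaussian noise to its mean."""
    comp = rng.choice(len(weights), size=n, p=weights)
    return means[comp] + stds[comp, None] * rng.standard_normal((n, means.shape[1]))

def gmm_density_and_score(x, weights, means, stds):
    """Exact density and score of an isotropic GMM at points x of shape (m, d)."""
    d = x.shape[1]
    diff = x[:, None, :] - means[None, :, :]                    # (m, k, d)
    sq = np.sum(diff**2, axis=-1)                               # (m, k)
    log_comp = (np.log(weights) - 0.5 * sq / stds**2
                - 0.5 * d * np.log(2 * np.pi * stds**2))        # (m, k)
    max_log = log_comp.max(axis=1, keepdims=True)
    unnorm = np.exp(log_comp - max_log)
    log_density = max_log[:, 0] + np.log(unnorm.sum(axis=1))
    resp = unnorm / unnorm.sum(axis=1, keepdims=True)           # responsibilities
    # Score = responsibility-weighted sum of each component's Gaussian score.
    score = -np.einsum("mk,mkd->md", resp / stds**2, diff)
    return np.exp(log_density), score

rng = np.random.default_rng(0)
w, mu, sd = sample_random_gmm(rng, n_components=4, dim=3)
x = gmm_sample(rng, w, mu, sd, n=16)          # context points for one training example
p, s = gmm_density_and_score(x, w, mu, sd)    # exact supervision targets
print(p.shape, s.shape)
```

Calling `sample_random_gmm` again yields a fresh target distribution, which is how a stream of new training examples with exact labels can be generated on demand.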

Performance

Overall, DiScoFormer outperformed KDE in both density and score estimation, and the gap widens in exactly the regimes where KDE struggles. In 100 dimensions the comparison is not close: relative to the best manually tuned KDE, DiScoFormer reduces score error by about 6.5x and density error by over 37x, and it keeps improving as samples are added while KDE runs out of memory. It also generalizes well beyond its training data, staying accurate on mixtures with more modes than it saw during training and on non-Gaussian shapes such as Laplace and Student's t. KDE's main remaining advantage is speed, especially when the dataset is small.

What we find most exciting about DiScoFormer is that score estimation is a dependency shared across many fields, including generative modeling, Bayesian inference, and scientific computing. A pretrained plug-in estimator that stays accurate in high dimensions and needs no retraining for each problem lowers that cost everywhere at once: one model can be reused wherever scores and densities appear.

For more information, we recommend reading our technical report.
