Versa AI hub
Research

Lack of children in public medical imaging data points to growing age bias in biomedical AI

By versatileai, July 18, 2025

Alex Lu, Stan Hua, Lauren Erdman

Artificial Intelligence (AI) is transforming healthcare, with applications ranging from detecting cancer in medical images to interpreting electronic medical records. We ask: are children being left behind? A 2023 survey found that only 22 of the 692 medical AI devices examined were transparently evaluated in children and approved by the FDA for pediatric use. This suggests that children may be excluded from the benefits that health AI has to offer. To raise awareness of this issue, the American College of Radiology recently formed a Pediatric AI Working Group to advocate for equal access to safe medical AI for children.

The problem is clear, but it remains unclear why pediatric AI is so underdeveloped. In a recent preprint, “Lack of children in public medical imaging data points to growing age bias in biomedical AI”, we hypothesize that this is driven in part by the underrepresentation of children in public datasets. Modern AI development relies on large amounts of data. If public biomedical data from children is limited, anyone trying to build and evaluate models for pediatric populations will face higher barriers, or may find the work impossible from the outset.

We test this hypothesis by conducting the largest review of public medical imaging datasets to date. From medical machine learning papers, we identified the ways in which authors find datasets for developing machine learning models. Using the same strategies, we collected a total of 181 public medical imaging datasets and analyzed how they report patient age and how patient ages are distributed.

Our main finding is that, although children make up 25% of the world’s population, less than 1% of public medical imaging data comes from children. Many datasets report no age at all, suggesting that dataset creators do not consider age to be important patient metadata. Even among the datasets that do report age (116 of 181), there are few attempts to balance the age distribution.
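
The figures above can be turned into two simple quantities: the share of datasets that report age, and how far children fall below proportional representation. The following is a minimal sketch, not the method used in the preprint. The function names and the notion of a "representation ratio" are assumptions introduced here for illustration, and because the article only says "less than 1%", the 0.01 value is an upper bound rather than a measured figure.

```python
# Figures quoted in the article. The image share is an upper bound,
# because the article only states "less than 1%".
DATASETS_TOTAL = 181
DATASETS_REPORTING_AGE = 116
CHILD_SHARE_OF_POPULATION = 0.25
CHILD_SHARE_OF_IMAGES_UPPER_BOUND = 0.01


def age_reporting_rate(reporting: int, total: int) -> float:
    """Fraction of datasets that report patient age."""
    return reporting / total


def representation_ratio(share_in_data: float, share_in_population: float) -> float:
    """Group's share of the data divided by its share of the population.

    1.0 means proportional representation; values below 1.0 mean the
    group is underrepresented. (Hypothetical metric, for illustration.)
    """
    return share_in_data / share_in_population


rate = age_reporting_rate(DATASETS_REPORTING_AGE, DATASETS_TOTAL)
ratio = representation_ratio(
    CHILD_SHARE_OF_IMAGES_UPPER_BOUND, CHILD_SHARE_OF_POPULATION
)
print(f"Datasets reporting age: {rate:.1%}")
print(f"Child representation ratio (upper bound): {ratio:.2f}")
```

Under these assumptions, roughly 64% of the datasets report age, and children appear in the data at no more than about one twenty-fifth of their share of the population.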

Figure 1. Children are underrepresented in public medical imaging datasets in all countries with available data.

We link this lack of data to several consequences.

First, the lack of pediatric data hinders machine learning research. At the Medical Imaging with Deep Learning (MIDL) conference, only one of the 46 studies in 2023 and 2024 used pediatric data. Importantly, the pediatric data gap is heterogeneous across medical imaging modalities and applications. Some AI applications have practically no pediatric samples with which to build or evaluate models. For example, our review of datasets identified almost 19,000 MRI images that could be used to build a model for diagnosing diseases such as cancer.

Figure 2. Breakdown of the number of images identified in public medical imaging datasets by machine learning task and imaging modality.

Second, in the absence of dedicated pediatric AI models, clinicians may unknowingly rely on “off-label” use of adult AI models in children. Children have historically been overlooked in the development of medications and medical devices compared to adults, so off-label use is common in pediatric clinical practice. Our research reinforces that off-label use can be dangerous when it comes to medical AI. We trained AI models to predict cardiomegaly, a condition characterized by an abnormally large heart, from chest X-ray images. The models fail increasingly often in younger healthy patients, with error rates reaching 50% in the youngest children in our evaluation (ages 0-1).
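
An age-stratified evaluation of this kind can be sketched in a few lines. This is an illustrative outline only, not the evaluation code from the preprint. The age bins other than 0-1, the record layout, and the toy records are all assumptions made up for this example and carry no real results.

```python
from collections import defaultdict

# Hypothetical age bins in years: (label, lower bound inclusive, upper bound exclusive).
# Only the "0-1" group is named in the article; the rest are illustrative.
AGE_BINS = [("0-1", 0, 2), ("2-10", 2, 11), ("11-17", 11, 18), ("18+", 18, 200)]


def age_bin(age_years: float) -> str:
    """Return the label of the bin that contains the given age."""
    for label, low, high in AGE_BINS:
        if low <= age_years < high:
            return label
    raise ValueError(f"age out of range: {age_years}")


def error_rate_by_age(records):
    """Compute the error rate per age bin.

    records: iterable of (age_years, true_label, predicted_label) tuples.
    Returns a dict mapping each observed bin label to errors / total.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for age, truth, predicted in records:
        label = age_bin(age)
        totals[label] += 1
        if truth != predicted:
            errors[label] += 1
    return {label: errors[label] / totals[label] for label in totals}


# Synthetic records for demonstration only: (age, true label, predicted label).
toy_records = [
    (0.5, 0, 1), (1.0, 0, 0), (1.5, 0, 0), (0.2, 0, 0),
    (6, 0, 0), (9, 0, 0),
    (35, 0, 0), (60, 1, 1), (72, 1, 0), (48, 0, 0),
]
rates = error_rate_by_age(toy_records)
for label, rate in rates.items():
    print(f"{label}: {rate:.0%}")
```

Reporting one number per age group, rather than a single pooled error rate, is what makes a failure concentrated in the youngest patients visible at all.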

Third, our analysis suggests that if we do not pay attention to the possibility that medical AI models are biased against children, the problem will only grow in future medical AI development. Recently, researchers have been training foundation models: generalist models that can handle a wide range of tasks. Training these models requires much larger datasets than earlier task-specific models did, so researchers often build datasets for foundation models by aggregating data from multiple sources. Our review identified 16 public datasets whose data comes from other public datasets. Of these 16, eight drew on source datasets that originally reported patient age, but in every case the age information was dropped during aggregation.
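
One way to avoid losing age during aggregation is to carry it through explicitly, together with a record of where each image came from. The sketch below assumes a simple list-of-dicts layout. The field names `image_id`, `age_years`, and `source`, and both helper functions, are hypothetical and do not correspond to any specific dataset or tool mentioned in the article.

```python
def aggregate(sources):
    """Merge records from several source datasets, keeping age and provenance.

    sources: dict mapping a source name to a list of record dicts.
    A record without an "age_years" key gets an explicit None, so missing
    ages stay visible instead of the field disappearing from the schema.
    """
    merged = []
    for name, records in sources.items():
        for record in records:
            merged.append(
                {
                    "image_id": record["image_id"],
                    "source": name,
                    "age_years": record.get("age_years"),
                }
            )
    return merged


def age_coverage(records):
    """Fraction of records that carry a non-missing age."""
    if not records:
        return 0.0
    with_age = sum(1 for r in records if r["age_years"] is not None)
    return with_age / len(records)


# Synthetic example: one source reports age, the other does not.
toy_sources = {
    "source_a": [
        {"image_id": "a1", "age_years": 4},
        {"image_id": "a2", "age_years": 57},
    ],
    "source_b": [
        {"image_id": "b1"},
        {"image_id": "b2"},
    ],
}
combined = aggregate(toy_sources)
print(f"Records with age after aggregation: {age_coverage(combined):.0%}")
```

Checking age coverage before and after the merge gives a quick signal of whether metadata that existed upstream has survived.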

Together, our research reveals a significant gap in the development of medical AI. The lack of public pediatric data puts children at risk of being left behind. We encourage the broader AI community to take action and to support initiatives that collect, prepare, and release AI-ready pediatric data to the public, so that pediatric AI applications can be developed.
