Versa AI hub

AI doctors learn to “see” medical images

By versatileai | May 6, 2025 | 6 min read

Google has given diagnostic AI the ability to understand visual medical information with its latest research on AMIE (Articulate Medical Intelligence Explorer).

Imagine chatting with an AI about a health concern. Instead of simply processing your words, it can actually see a photo of that worrying rash and make sense of your printed ECG. That's what Google is aiming for.

AMIE had already shown promise in text-based medical chats, thanks to earlier work published in Nature. But let's face it, real medicine is rarely just words.

Doctors rely heavily on what they can see: skin conditions, readings from machines, and so on. As the Google team points out, even simple instant messaging platforms "enable static multimodal information (e.g. images and documents) to enrich discussions."

Text-only AI was missing a huge piece of the puzzle. As the researchers put it, the big question was "whether LLMs can conduct diagnostic clinical conversations that incorporate this more complex type of information."

Google teaches AMIE what to look for, and why

Google's engineers bolstered AMIE using the Gemini 2.0 Flash model as the brains of the operation. They combined this with what they describe as a state-aware reasoning framework. In plain English, this means the AI doesn't just follow a script: it adapts the conversation based on what it has learned so far and what it still needs to find out.

That's close to how human clinicians work: gather clues, form ideas about what might be wrong, then ask for more specific information, including visual evidence, to narrow things down.

“This allows Amie to request relevant multimodal artifacts when needed, accurately interpret the findings, seamlessly integrate this information into ongoing dialogue, and use it to improve diagnostics,” explains Google.

Think of the conversation as flowing through phases: first gathering the patient's history, then follow-up questions, then moving towards diagnosis and management suggestions. The AI constantly evaluates its own understanding and asks for things like skin photos or lab results whenever it senses a knowledge gap.
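The phased, gap-driven loop described above can be sketched in a few lines of Python. This is purely illustrative: the phase names, state fields, and decision rules here are assumptions for the sketch, not AMIE's actual framework.

```python
from dataclasses import dataclass, field

# Hypothetical phases for a staged clinical conversation (illustrative only).
PHASES = ["history_taking", "follow_up", "diagnosis_and_management"]

@dataclass
class DialogueState:
    phase: str = "history_taking"
    facts: dict = field(default_factory=dict)           # what the model has learned so far
    knowledge_gaps: list = field(default_factory=list)  # what it still needs to find out

def next_action(state: DialogueState) -> str:
    """Pick the next move from the current state rather than a fixed script."""
    # If a gap needs visual evidence, request it (e.g. a skin photo or ECG printout).
    if any(gap.endswith("_image") for gap in state.knowledge_gaps):
        return "request_artifact"
    # Otherwise ask a question to close any remaining gap.
    if state.knowledge_gaps:
        return "ask_question"
    # No outstanding gaps: advance to the next phase of the consultation.
    i = PHASES.index(state.phase)
    if i + 1 < len(PHASES):
        state.phase = PHASES[i + 1]
        return "advance_phase"
    return "summarise_and_manage"
```

The point of the design is that the same state object drives both questioning and artifact requests, so asking for an image is just another way of closing a knowledge gap.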

To get this right without endless trial and error on real people, Google built a detailed simulation lab.

Google derived realistic medical images and data from sources such as the PTB-XL ECG database and the SCIN dermatology image set, then added plausible backstories using Gemini. Within this setup, AMIE could "chat" with simulated patients while automated metrics scored how well it performed on measures such as diagnostic accuracy and error avoidance (or "hallucinations").
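The simulation-lab idea boils down to three pieces: a real artifact, a generated backstory, and an automated scorer. A minimal sketch, assuming hypothetical names (`SimulatedCase`, `score_transcript`) that are not Google's actual framework:

```python
from dataclasses import dataclass

@dataclass
class SimulatedCase:
    artifact: str      # e.g. an ECG trace from PTB-XL or a SCIN skin photo
    backstory: str     # plausible patient history, generated by an LLM
    ground_truth: str  # the condition the artifact actually shows

def score_transcript(case: SimulatedCase, predicted_dx: str,
                     cited_findings: list[str], real_findings: list[str]) -> dict:
    """Automated scoring: diagnostic accuracy plus a hallucination rate,
    i.e. findings the model cited that are not actually in the artifact."""
    hallucinated = [f for f in cited_findings if f not in real_findings]
    return {
        "correct_diagnosis": predicted_dx == case.ground_truth,
        "hallucination_rate": len(hallucinated) / max(len(cited_findings), 1),
    }
```

Because scoring is automated, thousands of simulated consultations can be run and compared across model versions without involving a single real patient.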

Virtual OSCE: Google puts AMIE through its paces

The real test came in a setup designed to mirror how medical students are assessed: Objective Structured Clinical Examinations (OSCEs).

Google ran a remote study involving 105 different medical scenarios. Actors trained to consistently portray patients interacted with either the new multimodal AMIE or real human primary care physicians (PCPs). These chats took place through an interface where the "patients" could upload images, much like modern messaging apps.

The conversations were then reviewed by specialist physicians (in dermatology, cardiology, and internal medicine) and by the patient actors themselves.

The reviewers graded everything: how well the history was taken, diagnostic accuracy, the quality of the proposed management plan, communication skills and empathy, and, of course, how well the AI interpreted the visual information.

Surprising results from the simulated clinic

This is where it gets really interesting. In this direct comparison within a controlled research environment, Google found that AMIE more than held its own.

The AI was rated superior to the human PCPs at interpreting the multimodal data shared during the chats. It also produced differential diagnosis lists (ranked lists of possible conditions) that scored higher on diagnostic accuracy and were judged more accurate and complete given the case details.

Specialist physicians reviewing the transcripts tended to rate AMIE's performance higher in most areas. They particularly noted its "quality of image interpretation and reasoning," the thoroughness of its diagnostic workups, the soundness of its management plans, and its ability to flag when a situation needed urgent attention.

Perhaps one of the most surprising findings came from the patient actors: they often rated the AI as more empathetic and trustworthy than the human physicians in these text-based interactions.

And on an important safety note, the study found no statistically significant difference in how often AMIE made errors based on the images (hallucinated findings) compared with the human physicians.

Since technology never stands still, Google also ran some early tests swapping the Gemini 2.0 Flash model for the newer Gemini 2.5 Flash.

Using the same simulation framework, the results suggested further gains, particularly in getting the diagnosis right (top-3 accuracy) and in suggesting appropriate management plans.
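Top-3 accuracy, the metric mentioned above, simply counts a case as correct if the ground-truth condition appears anywhere in the model's top three ranked differentials. A small sketch, with hypothetical example data that is not from the study:

```python
def top_k_accuracy(cases: list[tuple[str, list[str]]], k: int = 3) -> float:
    """cases: (ground_truth_diagnosis, ranked_differential) pairs.
    A case is a hit if the ground truth appears in the top-k of the ranking."""
    hits = sum(1 for truth, ranked in cases if truth in ranked[:k])
    return hits / len(cases)

# Hypothetical example data (illustrative only):
cases = [
    ("eczema",    ["psoriasis", "eczema", "contact dermatitis"]),
    ("melanoma",  ["benign nevus", "seborrheic keratosis", "melanoma"]),
    ("psoriasis", ["eczema", "tinea", "contact dermatitis", "psoriasis"]),
]
print(top_k_accuracy(cases, k=3))  # 2 of the 3 cases hit within the top 3
```

Ranking-aware metrics like this reward a model for keeping the right answer near the top of its differential even when it isn't the single first guess.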

Promising, but the team was quick to add a dose of realism: these are automated results, and "a rigorous assessment with professional physician review is essential to confirm these performance benefits."

An important reality check

To its credit, Google is upfront about the limitations here: "This study explores a research-only system in an OSCE-style evaluation using patient actors, which substantially under-represents the complexity of actual care."

Simulated scenarios, however well designed, are not the same as dealing with the unique complexities of real patients in a busy clinic. The researchers also stress that a chat interface doesn't capture the richness of real video or in-person consultations.

So, what's next? Moving carefully towards the real world. Google has already partnered with Beth Israel Deaconess Medical Center on a research study to see how AMIE performs in real clinical settings with patient consent.

The researchers also acknowledge the need to eventually move beyond text and static images towards real-time video and audio, the kind of interaction common in telehealth today.

Giving AI the ability to "see" and interpret the kinds of visual evidence doctors use every day offers a glimpse of how AI may one day help clinicians and patients. But the path from these promising findings to a safe, reliable tool for everyday healthcare is a long one that requires careful navigation.

(Photo: Alexander Singh)

See: Are AI chatbots really changing the world of work?

Want to learn more about AI and big data from industry leaders? Check out the AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including the Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Check out other upcoming Enterprise Technology events and webinars with TechForge here.
