Versa AI hub

AI helps researchers to discover several previously unknown molecules

By versatileai, May 29, 2025

Mass spectrometry, a powerful tool for studying small molecules, has long helped scientists unlock secrets hidden in plants, microorganisms, and even human tissues. But for all its strength, the method has a serious limitation: its output is difficult to interpret. Each time a mass spectrometer analyzes a sample, it produces a complex fingerprint made of peaks and numbers. These patterns are called mass spectra. As the volume of data has grown, understanding what each spectrum means has remained a major challenge.

That challenge is now being met with artificial intelligence.

New ways to read molecular fingerprints

A team of scientists led by Dr. Tomáš Pluskal of IOCB Prague, together with Roman Bushuiev and collaborators from the Czech Institute of Informatics, Robotics and Cybernetics at the Czech Technical University (CIIRC CTU), created a new AI model called DreaMS. The system can reveal the structure of molecules from raw spectral data faster and more accurately than previous methods. Their work, published in Nature Biotechnology, marks significant progress in deciphering the hidden language of natural chemistry.

From left: Dr. Tomáš Pluskal, head of the Biochemistry of Plant Specialized Metabolites research group at IOCB Prague; Roman Bushuiev, IOCB Prague; Anton Bushuiev, CIIRC CTU; Raman Samusevich, IOCB Prague; Dr. Josef Šivic, CIIRC CTU. (Credit: Tomáš Belloň/IOCB Prague)

DreaMS was trained using a method known as self-supervised learning. It studied over 700 million raw mass spectra from the GNPS repository, which contains data collected from environmental and biological samples around the world. Without being told what any particular spectrum means, the model learned to find patterns, similarities, and hidden features in the data.

Dr. Josef Šivic, one of the researchers, compares this process with how language models like ChatGPT learn to understand text. “ChatGPT can infer the meaning of words and the connections between them from a large amount of text,” he says. “DreaMS learns to recognize which molecular structures are hidden within the spectrum. It is based on data from millions of examples.”

The unknown challenges of chemistry

Despite decades of research, scientists estimate that less than 10% of naturally occurring small molecules have been discovered. In other words, most of the world’s chemical diversity remains unexplored. These unknown molecules could be key to breakthroughs in medicine, environmental safety, and even understanding life across the globe.

The main problem is not collecting data but analyzing it. When a mass spectrometer runs, it generates two types of data. MS1 gives a broad overview of the molecules present, while MS2 zooms in on the fragments of a particular molecule.

These MS2 spectra hold the real clues to molecular identity, but only about 2% can be matched to known structures using reference libraries. Even advanced machine learning tools cannot confidently annotate more than 10% of spectra.
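To make the idea of matching a spectrum against a reference library concrete, here is a minimal sketch in Python. It assumes a common textbook approach (cosine similarity on binned peaks), not the specific method used by any tool named in this article. The function names, the 1.0 bin width, the 0.7 threshold, and the toy peak lists are all hypothetical.

```python
import math

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin index -> intensity)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(spec_a, spec_b, bin_width=1.0):
    """Cosine similarity between two spectra after binning; 0.0 if either is empty."""
    a = bin_spectrum(spec_a, bin_width)
    b = bin_spectrum(spec_b, bin_width)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_library_match(query, library, threshold=0.7):
    """Return (name, score) of the best reference, or (None, score) if below threshold."""
    best_name, best_score = None, 0.0
    for name, reference in library.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Made-up (m/z, intensity) peaks, for illustration only.
library = {
    "compound_A": [(50.0, 10.0), (77.0, 100.0), (105.0, 40.0)],
    "compound_B": [(60.0, 80.0), (91.0, 100.0)],
}
query = [(50.2, 12.0), (77.1, 95.0), (105.3, 35.0)]
unknown = [(133.0, 100.0), (161.0, 50.0)]

print(best_library_match(query, library))
print(best_library_match(unknown, library))
```

The second query shares no peaks with any reference, so it comes back unmatched. That is the situation the article describes for the large majority of real MS2 spectra.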

Previous tools relied heavily on limited spectral libraries or manual interpretation by experts. For example, the well-known software SIRIUS uses a complex pipeline that includes combinatorics, optimization, and support vector machines to infer molecular fingerprints. It works well, but it still relies on hand-crafted rules and curated data, which slows it down and limits its reach.

In contrast, DreaMS skips most of these steps. It learns directly from raw data without the need for human-designed shortcuts or annotated training sets. It predicts masked peaks in the spectrum and estimates when a particular chemical will appear during chromatography. Through this process, it constructs a 1,024-dimensional mathematical representation of each spectrum that captures detailed information about the molecular structure.
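The masked-peak objective can be illustrated with a short sketch of how a training example might be built. This is only an illustration of the general idea. The article does not state the masking ratio or the model architecture, so the 30% fraction, the function name `mask_peaks`, and the toy spectrum are assumptions.

```python
import random

def mask_peaks(peaks, mask_fraction=0.3, rng=None):
    """Hide a random subset of peaks.

    Returns (visible, hidden): a model would see `visible` and be trained
    to predict `hidden`. No labels are needed, only the spectrum itself.
    """
    if not peaks:
        return [], []
    rng = rng or random.Random()
    n_masked = min(len(peaks), max(1, round(len(peaks) * mask_fraction)))
    masked_idx = set(rng.sample(range(len(peaks)), n_masked))
    visible = [p for i, p in enumerate(peaks) if i not in masked_idx]
    hidden = [p for i, p in enumerate(peaks) if i in masked_idx]
    return visible, hidden

# Made-up (m/z, relative intensity) pairs, for illustration only.
spectrum = [
    (41.0, 0.10), (55.1, 0.25), (69.0, 0.40), (83.1, 0.15), (97.1, 0.30),
    (111.1, 0.55), (125.1, 0.20), (139.1, 1.00), (153.1, 0.35), (167.1, 0.05),
]

visible, hidden = mask_peaks(spectrum, mask_fraction=0.3, rng=random.Random(0))
print(len(visible), "visible peaks,", len(hidden), "hidden peaks")
```

Because the targets come from the data itself, this kind of objective can be applied to hundreds of millions of unannotated spectra.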

https://www.youtube.com/watch?v=wmxfckrls5e

A growing map of the chemical universe

One of the most striking results of this project is the DreaMS Atlas. This large, interconnected network links more than 200 million mass spectra. Each spectrum is like a page on a vast web: just as websites are connected via hyperlinks, the spectra in the DreaMS Atlas are connected based on chemical similarity.
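A network of this kind can be sketched as a nearest-neighbour graph over spectrum embeddings. The example below is a toy illustration, not the construction used for the actual atlas. It assumes cosine similarity as the similarity measure and uses made-up 3-dimensional vectors in place of the 1,024-dimensional representations mentioned above. The names `knn_graph` and `s1` to `s4` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def knn_graph(embeddings, k=2):
    """Link each spectrum to its k most similar neighbours."""
    graph = {}
    for name, vec in embeddings.items():
        scored = [
            (cosine(vec, other_vec), other_name)
            for other_name, other_vec in embeddings.items()
            if other_name != name
        ]
        scored.sort(reverse=True)  # most similar first
        graph[name] = [other_name for _, other_name in scored[:k]]
    return graph

# Made-up embeddings, for illustration only.
embeddings = {
    "s1": [0.9, 0.1, 0.0],
    "s2": [0.8, 0.2, 0.1],
    "s3": [0.7, 0.3, 0.0],
    "s4": [0.0, 0.1, 0.9],
}

graph = knn_graph(embeddings, k=2)
for name, neighbours in graph.items():
    print(name, "->", neighbours)
```

This brute-force version compares every pair, which is fine for four spectra. At the scale of hundreds of millions, an approximate nearest-neighbour index would be needed instead.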

Dr. Pluskal explains that the network helps scientists explore links they had never noticed before. For example, the DreaMS Atlas revealed an unexpected connection between pesticides, food, and human skin. It even led researchers to wonder whether certain pesticides could cause autoimmune conditions such as psoriasis. Insights of this kind were nearly impossible to find before.

The model is not just theoretical. It already supports real-world tasks. It can predict which chemical elements are present in a molecule, how many fragments it has, and even whether it contains a particular atom, such as fluorine. This last task was particularly surprising.

The DreaMS neural network overcomes the limitations of mass spectral libraries. (Credit: Nature Biotechnology)

“Fluorine is present in about a third of all drugs and pesticides, but previously we were unable to reliably detect it in the mass spectrum,” says Roman Bushuiev. After DreaMS was trained on millions of spectra and fine-tuned with thousands of fluorine-containing samples, the model learned to correctly identify fluorine.
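The pre-train-then-adapt pattern can be sketched with a small classifier trained on top of fixed embeddings. This is a linear probe, a simpler stand-in for the fine-tuning the article describes, whose details are not given here. The 2-dimensional features, the labels, the learning rate, and the function names are all made up for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_probe(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

def predict(weights, bias, x):
    """True if the probe predicts the positive class (here: contains fluorine)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5

# Made-up embeddings and labels (1 = contains fluorine), for illustration only.
features = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.9], [0.1, 0.7]]
labels = [1, 1, 0, 0]

weights, bias = train_probe(features, labels)
print([predict(weights, bias, x) for x in features])
```

The expensive representation is learned once from unlabeled spectra, and a comparatively small labeled set is then enough to adapt it to a specific question.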

The foundation of future discoveries

DreaMS represents a turning point in the use of machine learning for chemistry. Instead of relying on small datasets and slow, rule-based tools, researchers now have a foundation model that can adapt to many different tasks. It works across a variety of data and experimental conditions, making it flexible enough for use in areas such as drug development, environmental science, and even searching for life across the globe.

What makes DreaMS particularly exciting is the possibility of going even further. Researchers are currently working on the next step: teaching the model to predict complete molecular structures. If successful, this could speed up the discovery of new chemicals and allow scientists to navigate the unknown parts of the chemical world much more accurately.

Averaging DreaMS embeddings per sample allows sample-level analysis of metabolomics data, as illustrated on a food LC-MS/MS dataset. (Credit: Nature Biotechnology)

This work also demonstrates the power of self-supervised learning in science. By learning patterns from raw data without human labels, researchers can uncover hidden relationships and insights that were previously out of reach.

As Dr. Pluskal points out, “The model was trained on tens of millions of spectra from diverse organisms and environments, including plants, microorganisms, food, tissues, and soil samples. This allows us to reveal hidden similarities between spectra at first glance.”

For scientists looking to better understand the components of life, DreaMS offers a new path forward, one built on deep data and smarter machines, not speculation.
