Versa AI hub
Catalog of genetic mutations that help pinpoint the cause of disease

January 12, 2025. Updated: February 13, 2025.

Research

Published September 19, 2023

Authors: Žiga Avsec and Jun Cheng

New AI tool classifies impact of 71 million ‘missense’ mutations

Unraveling the root causes of disease is one of the greatest challenges in human genetics. With millions of possible mutations and limited experimental data, which mutations can cause disease remains largely a mystery. This knowledge is essential for faster diagnosis and the development of life-saving treatments.

Today, we are publishing a catalog of ‘missense’ mutations so that researchers can learn more about the effects these mutations may have. Missense mutations are genetic mutations that can affect the function of human proteins. In some cases, they can lead to diseases such as cystic fibrosis, sickle cell anemia, and cancer.

The AlphaMissense catalog was developed using AlphaMissense, our new AI model that classifies missense variants. In a paper published in the journal Science, we show that it classified 89% of all 71 million possible missense variants as either likely pathogenic or likely benign. By contrast, only 0.1% have been confirmed by human experts.

AI tools that can accurately predict the impact of variants have the power to accelerate research across fields from molecular biology to clinical genetics to statistical genetics. Experiments to uncover disease-causing mutations are expensive and labor-intensive. Every protein is unique and each experiment must be designed individually, which can take months. Using AI predictions, researchers can get previews of results for thousands of proteins at once, helping them prioritize resources and accelerate more complex studies.

We have made all our predictions freely available for commercial and research use and have open sourced the AlphaMissense model code.

AlphaMissense predicted the pathogenicity of all 71 million possible missense variants. It classified 89% and predicted that 57% were likely benign and 32% were likely pathogenic.

What is a missense variant?

A missense variant is a single letter substitution in DNA that results in a different amino acid in the protein. If you think of DNA as a language, changing just one letter can change the word and completely change the meaning of the sentence. In this case, the substitution changes the amino acid that is translated, which can affect the function of the protein.
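The one-letter analogy above can be made concrete with a small sketch. The codon table here is deliberately partial, and the function names are hypothetical, chosen only for this illustration. The GAG to GTG change is the well-known substitution behind sickle cell anemia, used as an example that is not taken from the text above.

```python
# Partial codon table: only the codons needed for this illustration.
CODON_TO_AMINO_ACID = {
    "GAG": "Glu",  # glutamic acid
    "GAA": "Glu",  # glutamic acid (a synonymous codon)
    "GTG": "Val",  # valine
}


def substitute(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at `position` replaced by `new_base`."""
    return codon[:position] + new_base + codon[position + 1:]


def describe_substitution(codon: str, position: int, new_base: str):
    """Apply a single-letter change and report whether the amino acid changes."""
    mutated = substitute(codon, position, new_base)
    before = CODON_TO_AMINO_ACID[codon]
    after = CODON_TO_AMINO_ACID[mutated]
    # A missense change swaps one amino acid for another;
    # a synonymous change leaves the amino acid unchanged.
    kind = "missense" if before != after else "synonymous"
    return mutated, before, after, kind


print(describe_substitution("GAG", 1, "T"))
print(describe_substitution("GAG", 2, "A"))
```

The first call changes the middle letter and swaps glutamic acid for valine, which is a missense change. The second call changes the last letter but still encodes glutamic acid, so the protein is unaffected.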

The average person carries more than 9,000 missense variants. Most are benign and have little effect, while others are pathogenic and can significantly disrupt protein function. Missense variants can be used to diagnose rare genetic diseases, where a few or even a single missense variant may directly cause disease. They are also important for studying complex diseases like type 2 diabetes, which are caused by a combination of many different types of genetic changes.

Classifying missense variants is an important step in understanding which of these protein changes can cause disease. Of the more than 4 million missense variants already observed in humans, only 2% have been annotated by experts as pathogenic or benign, which is roughly 0.1% of all 71 million possible missense variants. The remainder are considered “variants of unknown significance” because experimental or clinical data on their effects are lacking. With AlphaMissense, we now have the clearest picture to date: it classifies 89% of variants using a threshold that yields 90% precision on a database of known disease variants.
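The proportions quoted above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch that treats "more than 4 million" as exactly 4 million, so the results are approximate.

```python
# Figures quoted in the text (the 4 million is a lower bound, so results are approximate).
observed_missense = 4_000_000    # missense variants already seen in humans
possible_missense = 71_000_000   # all possible missense variants
annotated_percent = 2            # share of observed variants annotated by experts
classified_percent = 89          # share of possible variants classified by the model

# Integer arithmetic keeps the counts exact.
expert_annotated = observed_missense * annotated_percent // 100
model_classified = possible_missense * classified_percent // 100

# Expert-annotated variants as a share of everything that is possible.
expert_share = expert_annotated / possible_missense

print(f"Expert-annotated variants: {expert_annotated:,}")
print(f"Share of all possible variants: {expert_share:.2%}")
print(f"Variants classified by the model: {model_classified:,}")
```

Under these assumptions, about 80,000 expert-annotated variants correspond to roughly 0.1% of the 71 million possibilities, while 89% corresponds to about 63 million variants.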

Pathogenic or Benign: How AlphaMissense Classifies Variants

AlphaMissense is based on AlphaFold, a breakthrough model that predicts the structure of nearly every protein known to science from its amino acid sequence. Our adapted model can predict the pathogenicity of missense variants that alter individual amino acids in a protein.

To train AlphaMissense, we fine-tuned AlphaFold on labels that distinguish variants seen in human and closely related primate populations. Variants that are commonly seen are treated as benign, and variants that are never seen are treated as pathogenic. AlphaMissense does not predict changes in protein structure due to mutations or other effects on protein stability. Instead, it uses databases of related protein sequences and the structural context of the variant to produce a score between 0 and 1 that approximately rates the likelihood that the variant is pathogenic. Because the score is continuous, users can choose a classification threshold for calling variants pathogenic or benign that matches their accuracy requirements.
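The idea of turning a continuous score into a class can be sketched as follows. This is a minimal illustration, not the AlphaMissense implementation: the function names, the cutoff values, and the labeled toy data are all assumptions made up for this example. The second function shows one simple way a user might pick a cutoff that reaches a target precision on variants whose status is already known.

```python
def classify(score: float, benign_below: float, pathogenic_above: float) -> str:
    """Map a score in [0, 1] to a class using two user-chosen cutoffs."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score < benign_below:
        return "likely benign"
    if score > pathogenic_above:
        return "likely pathogenic"
    return "uncertain"


def precision_at(threshold: float, labeled):
    """Fraction of variants scoring above `threshold` that are truly pathogenic.

    `labeled` is a list of (score, is_pathogenic) pairs.
    Returns None when no variant scores above the threshold.
    """
    called = [is_pathogenic for score, is_pathogenic in labeled if score > threshold]
    if not called:
        return None
    return sum(called) / len(called)


def lowest_threshold_for_precision(labeled, target: float):
    """Scan observed scores in ascending order; return the first meeting the target."""
    for threshold in sorted({score for score, _ in labeled}):
        precision = precision_at(threshold, labeled)
        if precision is not None and precision >= target:
            return threshold
    return None


# Toy labeled data (hypothetical): (score, known to be pathogenic?)
known_variants = [
    (0.05, False), (0.20, False), (0.35, True), (0.50, False),
    (0.70, True), (0.85, True), (0.95, True),
]

cutoff = lowest_threshold_for_precision(known_variants, target=0.9)
print("Chosen pathogenic cutoff:", cutoff)
# The benign cutoff of 0.3 is an arbitrary illustrative value.
print(classify(0.90, benign_below=0.3, pathogenic_above=cutoff))
```

A stricter target precision pushes the cutoff higher and leaves more variants in the uncertain band, which is the trade-off a continuous score lets each user make for themselves.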

Diagram showing how AlphaMissense classifies human missense variants. When a missense variant is entered, the AI system scores it as likely pathogenic or likely benign. AlphaMissense combines structural context and protein language modeling and is fine-tuned on human and primate variant population frequency databases.

AlphaMissense delivers state-of-the-art predictions across a wide range of genetic and experimental benchmarks without explicitly training on such data. Our tool outperformed other computational methods when used to classify variants from ClinVar, a public archive of data on the relationship between human variants and disease. Our model was also the most accurate method for predicting results from laboratory experiments, which shows that it is consistent with different ways of measuring pathogenicity.

AlphaMissense outperforms other computational methods in predicting missense variant effects.
Left: Comparison of the performance of AlphaMissense and other methods in classifying variants from the ClinVar public archive. The methods shown in gray were trained directly on ClinVar, and some of their training variants are included in this test set, so their performance on this benchmark is likely overestimated.
Right: Graph comparing the performance of AlphaMissense and other methods in predicting measurements from biological experiments.

Building community resources

AlphaMissense builds on AlphaFold to advance the world’s understanding of proteins. A year ago, we released 200 million protein structures predicted using AlphaFold, which are helping millions of scientists around the world accelerate their research and pave the way for new discoveries. We look forward to seeing how AlphaMissense helps solve open questions at the heart of genomics and the biological sciences as a whole.

We have made AlphaMissense’s predictions freely available to both the commercial and scientific communities. We are working with EMBL-EBI to make it even easier to use through the Ensembl Variant Effect Predictor.

In addition to a lookup table of missense mutations, we have shared expanded predictions for all 216 million possible single amino acid substitutions across more than 19,000 human proteins. We have also included the average prediction for each gene. This average is similar to a measure of the gene’s evolutionary constraint, which indicates how essential the gene is to the organism’s survival.
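A per-gene average of this kind can be sketched in a few lines. The rows, gene names, and column layout below are hypothetical placeholders and do not reflect the format of the released files.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical lookup-table rows: (gene, amino acid substitution, pathogenicity score)
rows = [
    ("GENE_A", "A10V", 0.90),
    ("GENE_A", "K25R", 0.70),
    ("GENE_B", "L5P", 0.10),
    ("GENE_B", "S40T", 0.30),
]


def mean_score_by_gene(rows):
    """Group scores by gene and return the average score for each gene."""
    scores = defaultdict(list)
    for gene, _substitution, score in rows:
        scores[gene].append(score)
    return {gene: mean(values) for gene, values in scores.items()}


averages = mean_score_by_gene(rows)
# Genes with a higher average tolerate fewer substitutions in this toy data.
for gene, average in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    print(f"{gene}: {average:.2f}")
```

In this toy data, GENE_A has the higher average, which would suggest that substitutions in it are less well tolerated than those in GENE_B.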

Examples of AlphaMissense predictions overlaid on AlphaFold-predicted structures (red = predicted to be pathogenic, blue = predicted to be benign, gray = uncertain). Red dots represent known pathogenic missense variants and blue dots represent known benign variants from the ClinVar database.
Left: HBB protein. Variants of this protein can cause sickle cell anemia.
Right: CFTR protein. Variants of this protein can cause cystic fibrosis.

Accelerating genetic disease research

An important step in translating this research is collaborating with the scientific community. We have been working with Genomics England to explore how these predictions can help in studying the genetics of rare diseases. Genomics England cross-referenced AlphaMissense’s findings with previously compiled variant pathogenicity data from human participants. Their evaluation confirmed that our predictions were accurate and consistent, providing AlphaMissense with another real-world benchmark.

Although our predictions are not designed for direct use in the clinic and must be interpreted alongside other sources of evidence, this work has the potential to improve the diagnosis of rare genetic diseases and to help discover new disease-causing genes.

Ultimately, the hope is that AlphaMissense, along with other tools, will help researchers better understand diseases and develop new life-saving treatments.

Learn more about AlphaMissense here.

Notes

*As of March 13, 2024, AlphaMissense predictions are available under the CC BY 4.0 license, which lifts the previous non-commercial use restriction. Please refer to the public database and Zenodo for detailed access information.

We would like to thank Juanita Bawagan, Jess Valdez, Katie McAtackney, Kathryn Seager, and Hollie Dobson for their help with the text and figures. We would also like to thank our external partners Genomics England and EMBL-EBI for their continued support. This research was made possible thanks to the contributions of co-authors Guido Novati, Joshua Pan, Clare Bycroft, Akvilė Žemgulytė, Taylor Applebaum, Alexander Pritzel, Lai Hon Wong, Michal Zielinski, Tobias Sargeant, Rosalia G. Schneider, Andrew W. Senior, John Jumper, Demis Hassabis, and Pushmeet Kohli. We would also like to thank Kathryn Tunyasuvunakool, Rob Fergus, Eliseo Papa, David La, Zachary Wu, Sarah Jane Dunn, Kyle R. Taylor, Natasha Latisheva, Hamish Tomlinson, Augustin Zidek, Roz Onions, Mira Lutfi, John Small, Molly Beck, Annette Obika, Hannah Gladman, Folake Abu, Alyssa Pierce, James Tam, Q Green, Meera Last, Tharindi Hapuarachchi, and the greater Google DeepMind team for their support, assistance, and feedback.
