New AI Model Predicts Mutation Effects in DNA’s Dark Regions

Following AlphaFold, the artificial intelligence model that deciphers the 3D structure of proteins and earned Demis Hassabis and John Jumper a share of the 2024 Nobel Prize in Chemistry, alongside David Baker, Google DeepMind has introduced a new artificial intelligence model called AlphaGenome, CE Report quotes KosovaPress.

It is designed to read the so-called dark matter of DNA: the genetic sequences that do not code for proteins but affect their activity, KosovaPress reports.

Long mistakenly labeled as junk DNA, these poorly understood sequences make up the vast majority of human DNA, up to 98 percent.

The AlphaGenome model is described in a paper that has not yet been peer-reviewed.

"The coding part of our genome, made up of around twenty thousand genes, is now well known," Giuseppe Novelli, a geneticist at Tor Vergata University in Rome, told ANSA.

"The other part, by contrast, is extremely heterogeneous: one part consists of repetitive DNA, another of mobile elements that can change their position. These are still genes, however, estimated at six to sixty-three thousand, but they code for single-stranded RNA molecules that bind to DNA. Given their large number, it is very important to have a tool like this, which can at least indicate which family they belong to," said Novelli.

AlphaGenome can read long DNA sequences, up to a million letters, and make thousands of predictions about their function and the potential effects of each mutation.

In one example, researchers led by Žiga Avsec tested the model on mutations identified in people with leukemia, and AlphaGenome correctly predicted that the mutations would indirectly activate a nearby gene considered one of the most common causes of this type of cancer.

However, the model is still limited: it has been trained only on data from humans and mice, and it has difficulty when mutations affect genes located far away.
