A program has predicted the structures of all 200 million proteins known to science: how is this possible?

Researchers have compiled a database of 200 million protein structures. They achieved this with the AlphaFold program, which DeepMind developed in 2018 and released in July 2021. The open source program predicts the 3D structure of a protein from the sequence of its amino acids, the building blocks that make up proteins. The structure of a protein dictates its function, so the database generated by AlphaFold will help uncover protein functions that people can put to use.

The protein paradox

Proteins are the building blocks of life. They are produced by a variety of organisms, from bacteria to plants and animals, and once they are formed, they fold into shape within milliseconds. They consist of chains of amino acids folded into complex shapes, and their three-dimensional structure largely determines their function. Once you figure out how a protein folds, you can understand how it works and change its behavior.

Although DNA provides instructions for making chains of amino acids, predicting how those chains interact to form a three-dimensional shape has been very difficult. Until recently, scientists had deciphered only a fraction of the 200 million proteins known to science. The problem is that protein structures are so complex that guessing what form they will take is almost impossible.

AlphaFold by DeepMind has created 3D images of protein structures. Image courtesy of DeepMind

Cyrus Levinthal, an American molecular biologist, described the paradox in a 1969 paper: despite the huge number of possible configurations, proteins fold quickly and accurately. Yet each protein could, in principle, take any of roughly 10^300 possible final forms.

Thus, wrote Levinthal, if one were to try to find the correct form of a protein by testing each configuration one by one, it would take longer than the age of the universe.
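Levinthal's argument can be checked with some rough arithmetic. The sketch below is only an illustration: the 10^300 figure comes from the text above, while the sampling rate of 10^13 configurations per second is a hypothetical assumption chosen for the example, and the age of the universe is taken as roughly 13.8 billion years.

```python
# Figure quoted in the article: possible final forms of a single protein.
CONFIGURATIONS = 10**300

# Assumption for illustration only: the search tests 10^13 configurations
# per second. This rate is hypothetical and does not come from the article.
CONFIGS_PER_SECOND = 10**13

# Approximate age of the universe: about 13.8 billion years,
# using a 365.25-day year (31,557,600 seconds).
SECONDS_PER_YEAR = 31_557_600
AGE_OF_UNIVERSE_SECONDS = 13_800_000_000 * SECONDS_PER_YEAR


def brute_force_seconds(configurations: int, rate: int) -> int:
    """Seconds needed to test every configuration once at a fixed rate."""
    return configurations // rate


def order_of_magnitude(n: int) -> int:
    """Power of ten of a positive integer, e.g. 4500 -> 3."""
    return len(str(n)) - 1


seconds = brute_force_seconds(CONFIGURATIONS, CONFIGS_PER_SECOND)
ages = seconds // AGE_OF_UNIVERSE_SECONDS

# Only the orders of magnitude are meaningful here, not the exact digits.
print(f"Search time: about 10^{order_of_magnitude(seconds)} seconds")
print(f"About 10^{order_of_magnitude(ages)} times the age of the universe")
```

Under the assumed rate, the search would take about 10^287 seconds, or roughly 10^269 times the age of the universe. Making the assumed rate a billion times faster would still leave the result far out of reach, which is the heart of the paradox.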

Earlier attempts by scientists

Scientists have ways to visualize proteins and analyze their structure, but the work is slow and difficult. According to the journal Nature, X-ray crystallography is most commonly used to image proteins. In this method, X-rays are directed at solid crystals of a protein, and the way the rays are diffracted is measured to work out how the protein is structured. According to DeepMind, this experimental work has established the shape of about 190,000 proteins.

New method

In November 2020, DeepMind, a company dedicated to AI, announced that its AlphaFold program can quickly predict this information using an algorithm. Since then, the program has been working through the genetic codes of every organism whose genome has been sequenced and predicting the structures of the hundreds of millions of proteins they collectively contain.

AlphaFold works by accumulating knowledge about amino acid sequences and their interactions as it tries to interpret protein structures. As a result, the algorithm has learned to predict the shape of a protein in a matter of minutes, with accuracy down to the level of atoms.

Last year, DeepMind made publicly available a database of protein structures for 20 species, including nearly all of the 20,000 proteins expressed in humans. The company has now completed the work and released predicted structures for over 200 million proteins.

How is the technology applied?

Researchers are already using the fruits of AlphaFold's labor. According to The Guardian, the program allowed scientists to definitively characterize a key protein in the malaria parasite that had defied X-ray crystallography. This should eventually help improve vaccines against the disease.

3D image of a malaria protein. Image courtesy of DeepMind

Honey bee researcher Vilde Leipart from the Norwegian University of Life Sciences used AlphaFold to reveal the structure of vitellogenin, a reproductive and immune protein produced by all egg-laying animals. The discovery will help develop new ways to protect animals such as honey bees and fish from disease. This matters because these animals are important to the human food supply.

The program is also informing the search for new pharmaceuticals, Rosana Kapeller, CEO of ROME Therapeutics, said in a statement to DeepMind. “The speed and accuracy of AlphaFold is accelerating the drug development process. We are just beginning to realize its impact on the development of pharmaceuticals,” she added.

AlphaFold models are also being used by scientists from the Center for Enzyme Innovation at the University of Portsmouth to identify enzymes from the natural world that can be adapted for processing plastics.
