165 new cancer genes identified using machine learning

Previously unknown genes

A new algorithm can predict which genes cause cancer, even if the DNA sequence has not changed. A team of researchers in Berlin combined a wide variety of data and analyzed it using "artificial intelligence" to identify a large number of cancer genes. This opens up new perspectives for the development of targeted cancer treatments and biomarkers in personalized medicine.

In cancer, cells multiply and invade tissues, destroying organs and thereby disrupting their vital functions. This unlimited growth is usually caused by an accumulation of DNA changes in cancer genes, that is, mutations in genes that control cell development. However, some cancers have very few mutated genes, which means that other causes must lead to the disease in these cases.

A group of researchers from the Max Planck Institute for Molecular Genetics (MPIMG) in Berlin and the Institute of Computational Biology at Helmholtz Zentrum München has used machine learning methods to identify 165 previously unknown cancer genes. The researchers used a specially developed algorithm to analyze the data.

The sequences of these genes are not necessarily altered. Apparently, dysregulation of these genes alone can already lead to cancer. All of the newly identified genes interact closely with well-known cancer genes, and experiments in cell cultures have shown that they are essential for the survival of tumor cells.

Additional goals for personalized medicine

The algorithm, called EMOGI (Explainable Multi-Omics Graph Integration), can also explain the relationships between the cellular mechanisms that turn a gene into a cancer gene. As a group of researchers led by Annalisa Marsico explains in the journal Nature Machine Intelligence, the software integrates tens of thousands of datasets generated from patient samples. These include information on DNA methylation, the activity of individual genes, and protein interactions within cellular pathways, as well as sequence data with mutations. In this data, deep learning algorithms discover the patterns and molecular principles that lead to the development of cancer.
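The idea of combining per-gene measurements with a protein interaction network can be illustrated with a small sketch. This is not EMOGI's actual model, which is a trained deep learning system; it is only a toy example, under the assumption that each gene carries a short feature vector and that information is shared by averaging with interaction partners. All gene names, values, and the function `propagate` are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-gene features (made up for illustration, not study data):
# [mutation signal, methylation change, expression change]
features = {
    "GENE_A": [1.0, 0.0, 0.0],
    "GENE_B": [0.0, 1.0, 0.5],
    "GENE_C": [0.0, 0.0, 0.0],  # unchanged sequence, no signal of its own
}

# Hypothetical undirected protein-protein interactions.
interactions = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C")]


def propagate(features, interactions):
    """Average each gene's features with those of its interaction partners.

    This is a single, unweighted neighbourhood-averaging step. A real graph
    neural network would apply learned weights and several such layers.
    """
    partners = defaultdict(set)
    for a, b in interactions:
        partners[a].add(b)
        partners[b].add(a)

    smoothed = {}
    for gene, own in features.items():
        group = [own] + [features[p] for p in sorted(partners[gene])]
        smoothed[gene] = [sum(column) / len(group) for column in zip(*group)]
    return smoothed


smoothed = propagate(features, interactions)
for gene, vector in smoothed.items():
    print(gene, vector)
```

In this toy example, GENE_C has no signal of its own but picks up methylation and expression signal from its partner GENE_B, which mirrors how a gene with an unchanged sequence can still stand out through its network context.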

Unlike traditional cancer treatments such as chemotherapy, personalized treatments are tailored to the specific type of tumor. “Our goal is to choose the best treatment for each patient, that is, the most effective treatment with the fewest side effects. In addition, molecular properties can be used to detect cancers at an early stage,” explains Marsico, head of a research group at the MPIMG.

“Only by knowing the cause of the disease can we effectively counteract or correct it,” the researchers write. “This is why it is so important to identify as many cancer-causing mechanisms as possible.”

Better results with a combination

“To date, most studies have focused on pathogenic changes in sequence or cellular patterns,” said Roman Schulte-Sasse, a doctoral student on Marsico's team and first author of the publication. “At the same time, it has recently become clear that epigenetic disturbances or dysregulation of gene activity can also lead to cancer.”

This is why the researchers merged sequence data, which reflect faults in the genetic blueprint, with information that represents events in cells. The scientists first confirmed that mutations, or the multiplication of genomic segments, are indeed the main drivers of cancer. Then, in a second step, they identified candidate genes that are less directly connected to the genes that actually drive cancer.

“For example, we found genes whose sequence is largely unchanged in cancer, but which regulate the supply of energy and are therefore indispensable for tumors,” says Schulte-Sasse. “These genes are out of control by other means, for example because of chemical changes in DNA, such as methylation. These changes do not affect sequence information, but they govern gene activity. Such genes are promising targets for drug discovery, but because they work in the background, they can only be found using sophisticated algorithms.”

Further research

The researchers' new program adds a considerable number of new entries to the list of suspected cancer genes, which has grown to between 700 and 1,000 in recent years. The researchers were only able to track down the hidden genes by combining bioinformatics analysis with modern artificial intelligence (AI) techniques.

There are many more interesting details hidden in the data. “We are seeing a lot of cancer patterns,” says Marsico. “I think this is evidence that tumors are caused by different molecular mechanisms in different organs.”

The researchers emphasize that the EMOGI program is not limited to cancer. In theory, it can be used to integrate different sets of biological data and find patterns in them. The algorithm could therefore also be applied to similarly complex diseases.
