AI taught to translate code from one language to another

Scientists explained that artificial intelligence and machine learning systems have in recent years become increasingly capable, not only of understanding text but also of writing it. However, they still have little command of programming languages. To remedy this, IBM announced at Think 2021 that its researchers had created the IBM CodeNet project, aimed at teaching AI to translate code.

"We need our own ImageNet, one that can explore innovative ideas and reflect them in various algorithms," the researchers noted. "CodeNet is, in fact, ImageNet for computers. It is a massive dataset for teaching AI/ML systems to translate code, consisting of 14 million chunks and 500 million lines in over 55 legacy and active languages, from COBOL and FORTRAN to Java, C++ and Python."

They explained that the dataset is built in a way that allows bidirectional translation. That is, a user can take legacy code, which is often used in banking and government, and translate it into Java or another language.

Scientists explained that the dataset is drawn from many kinds of programming competitions and all sorts of problems, some more advanced, some more academic. Moreover, these languages have been used over the past decade and a half in many of these competitions, with solutions submitted by thousands of students.

Users can also run individual chunks of code "for extracting metadata and checking the results of generative AI models for correctness." This will allow researchers to program equivalent pieces of code when translating from one programming language to another.
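
As a rough illustration of what checking a translated solution for correctness can look like, the sketch below runs an original and a translated solution on the same inputs and reports any differences. It is a minimal Python example under stated assumptions, not CodeNet's actual tooling: the function names and the sample task are hypothetical, and both versions are written in Python only to keep the example self-contained.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    reference: Callable[[int], int],
    candidate: Callable[[int], int],
    test_inputs: Iterable[int],
) -> List[Tuple[int, int, int]]:
    """Return (input, expected, actual) for every input where the outputs differ."""
    mismatches = []
    for value in test_inputs:
        expected = reference(value)
        actual = candidate(value)
        if expected != actual:
            mismatches.append((value, expected, actual))
    return mismatches


# Hypothetical "original" solution: sum of 1..n computed with a loop.
def sum_to_n_original(n: int) -> int:
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


# Hypothetical "translated" solution: the same task using the closed form.
def sum_to_n_translated(n: int) -> int:
    return n * (n + 1) // 2


# Hypothetical faulty translation with an off-by-one error.
def sum_to_n_faulty(n: int) -> int:
    return n * (n - 1) // 2


if __name__ == "__main__":
    inputs = range(0, 50)
    print(find_mismatches(sum_to_n_original, sum_to_n_translated, inputs))
    print(find_mismatches(sum_to_n_original, sum_to_n_faulty, [0, 1, 2]))
```

An empty result only shows that the two versions agree on the inputs tried; it does not prove equivalence in general, so the choice of test inputs matters.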

The scientists added that their development is important for automated reasoning and decision making, with the ability to explain these decisions. In fact, this is the same branch of model development as computer vision and natural language processing.

