Facebook's new automatic speech recognition model works with 51 languages

Facebook researchers have unveiled the largest automatic speech recognition (ASR) model to date. It learned to understand 51 languages after being trained on 16,000 hours of voice recordings. In a paper published on arXiv.org, the co-authors argue that the system, which contains about a billion parameters, improves speech recognition performance by up to 28.8%.

Before training, the scientists divided the 51 languages into separate groups and then selected 10,000 vocabulary units as the token set for each language group. After that, they manually combined some of the smaller language groups until only six remained. This sped up model training several times over.
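The grouping step can be illustrated with a minimal sketch. The researchers merged the groups by hand, so the greedy rule below (repeatedly combine the two groups with the fewest languages) is only a stand-in assumption, and the function name, group names, and language codes are hypothetical.

```python
from typing import Dict, List


def merge_smallest_groups(groups: Dict[str, List[str]], target: int = 6) -> List[List[str]]:
    """Merge the two smallest language groups until `target` groups remain.

    Assumption: "small" means "contains few languages". The article only
    says the merging was done manually, so this rule is illustrative.
    """
    if target < 1:
        raise ValueError("target must be at least 1")

    # Work on copies so the caller's lists are not modified.
    merged = [list(langs) for langs in groups.values()]

    while len(merged) > target:
        # Stable sort by group size, smallest first.
        merged.sort(key=len)
        first = merged.pop(0)
        second = merged.pop(0)
        merged.append(first + second)

    return merged


if __name__ == "__main__":
    # Hypothetical starting groups, not the ones used by the researchers.
    example = {
        "group_a": ["en", "de", "nl"],
        "group_b": ["es", "it"],
        "group_c": ["hi"],
        "group_d": ["ta"],
    }
    for group in merge_smallest_groups(example, target=2):
        print(group)
```

Each resulting group would then get its own 10,000-unit vocabulary, as described above. Fewer groups means fewer separate vocabularies to build and train against.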

“As far as we know, this is the first work that studies multilingual systems on a massive scale. We obtained a unified speech recognition architecture for 51 languages that does not require a lot of resources,” Facebook said.

The researchers report that, over the course of several experiments, the most effective version of their model improved word recognition by 28.75%. This figure is several times higher than that of comparable systems and is expected to improve with further training.

In the paper, the scientists also noted that they will soon publish a second version of the system. It is simpler and achieves the desired results in just 10 minutes. It was trained on 53,000 hours of "raw" material.
