Google is making progress toward its goal of building an AI language model that supports 1,000 different languages. The company’s Universal Speech Model (USM) is a system that supports over 100 languages and serves as the “foundation” to build an even more expansive system.
USM is described as “a family of state-of-the-art speech models” with 2 billion parameters, trained on 12 million hours of speech and 28 billion sentences of text spanning more than 300 languages. It performs automatic speech recognition (ASR) and is already used by YouTube to generate closed captions.
Google aims to create a language model supporting 1,000 of the world’s most-spoken languages. The technology could have various uses, including in augmented-reality glasses like the concept Google showed off during its I/O event last year, which could detect speech and translate it in real time. However, Google’s misrepresentation of the Arabic language during that I/O presentation shows how easy it is to get something wrong.
Google’s USM is a “critical first step” in realizing the company’s language model goals. Google has already announced plans to show off more than 20 products powered by artificial intelligence during its annual I/O event this year.
Meta is working on a similar AI translation tool, which is still in its early stages. Google’s USM, by contrast, is already in use and supports over 100 languages.
Google has also posted a research paper that explains in more detail how USM works.