Meta is taking a step forward in language preservation and in the spread of text-to-speech and speech-to-text technology. Mark Zuckerberg’s company has created MMS, an artificial intelligence that can identify more than 4,000 spoken languages.
Massively Multilingual Speech, as it is officially named, is a tool to “make it easier for people to access information and to use devices in their preferred language,” as Meta explains.
The model extends text-to-speech and speech-to-text technology from around 100 languages to more than 1,100, and can identify more than 4,000 spoken languages.
“We are sharing our models and code so that others in the research community can build on our work and help preserve the planet’s languages, bringing the world closer together,” emphasizes Mark Zuckerberg’s company.
Meta used the Bible and other religious texts to train its artificial intelligence
There is a religious basis to the work of Zuckerberg’s company: Meta used texts such as the Bible to train the artificial intelligence.
Meta collected a data set of readings from the New Testament in more than 1,100 languages, providing an average of 32 hours of data per language.
“By considering unlabeled recordings of other Christian religious readings, we increased the number of languages available to more than 4,000,” Meta explains. MMS works with both male and female voices without difficulty.
Researchers at Meta’s AI lab worked with an algorithm designed to align audio recordings with accompanying text, explains MIT Technology Review.
They then repeated the process with a second algorithm trained on the newly aligned data. As a result, MMS could learn a new language more easily, even without accompanying text.
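The two-stage process described above (align audio with text using an initial model, then retrain on the newly aligned data) can be illustrated with a toy sketch. Everything here, including the data, the scoring function, and the word-count “model”, is invented for illustration and is not Meta’s actual MMS pipeline.

```python
# Toy illustration of the align-then-retrain loop: a weak model pairs audio
# segments with sentences, and a second model is "trained" on those pairs.
# All names and data are hypothetical stand-ins, not Meta's real code.

def score(segment, sentence, weights):
    """Hypothetical acoustic score: weighted overlap between a segment's
    detected 'sounds' and the words of a candidate sentence."""
    return sum(weights.get(w, 0) for w in sentence.split() if w in segment)

def align(segments, sentences, weights):
    """Stage 1: pair each audio segment with its highest-scoring sentence."""
    return [(seg, max(sentences, key=lambda s: score(seg, s, weights)))
            for seg in segments]

def retrain(pairs):
    """Stage 2: 'train' a second model by counting words in aligned pairs."""
    weights = {}
    for _seg, sentence in pairs:
        for w in sentence.split():
            weights[w] = weights.get(w, 0) + 1
    return weights

# Align with a weak initial model (uniform weights)...
segments = [{"hello", "world"}, {"good", "night"}]
sentences = ["hello world", "good night"]
initial = {w: 1 for s in sentences for w in s.split()}
pairs = align(segments, sentences, initial)

# ...then retrain on the newly aligned data and realign with the new model.
better = retrain(pairs)
final = align(segments, sentences, better)
```

In the real system the alignment operates on acoustic representations rather than word sets, but the structure, a rough first pass producing labels that a stronger second pass is trained on, is the same.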
“We can use what the model learned to quickly build voice systems with very, very little data,” says Michael Auli, a research scientist at Meta. “For English, we have lots and lots of good data sets, and we have them for a few other languages, but we just don’t have that for languages that are spoken by, say, a thousand people.”
For the moment, MMS still runs the risk of incorrectly transcribing certain words or phrases. In addition, its speech recognition models produced slightly more biased words than rival models, 0.7% more, according to MIT Technology Review.