{{see also|Papers}}
==Universal Speech Model==
The [[Universal Speech Model]] ([[USM]]) is a state-of-the-art family of speech models with 2 billion parameters, designed to perform automatic speech recognition (ASR) in over 300 languages. USM was trained on 12 million hours of speech audio and 28 billion text sentences. Developed for applications such as subtitles on [[YouTube]], the system supports widely spoken languages like English and Mandarin as well as lower-resource languages, including Punjabi, Assamese, Santhali, Balinese, Shona, Malagasy, Luganda, Luo, Bambara, Soga, Maninka, Xhosa, Akan, Lingala, Chichewa, Nkore, and Nzema.


==Model Development and Efficacy==