All public logs

Combined display of all available logs of AI Wiki. You can narrow down the view by selecting a log type, the username (case-sensitive), or the affected page (also case-sensitive).

16:56, 2 March 2023 Nicoboomer talk contribs created page Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers (VALL-E) (Created page with "{{see also|Papers}} ==Introduction== In the last decade, there have been significant advances in speech synthesis via neural networks and end to end modeling. Current text-to-speech (TTS), systems require high-quality data from recording studios. They also suffer from poor generalization for unseen speaker in zero-shot situations. A new TTS framework, VALL-E, has been developed to address this issue. It uses audio codec codes for an intermediate representation as well a...")