Speech Learning Model

Fine-tuning a Strong Language Model to Enable Classroom Speech Recognition

Postdoctorate Viet Anh Trinh led a project within Strand 1 to develop a novel neural network architecture that can both recognize and generate speech. He has since moved on from iSAT to a role at ...

InfoQ

Google AI Updates Universal Speech Model to Scale Automatic Speech Recognition beyond 100 Languages

HUB

Model moves computers closer to understanding human conversation

An engineer from the Johns Hopkins Center for Language and Speech Processing has developed a machine learning model that can distinguish functions of speech in transcripts of dialogues output by ...

eLife

High-Fidelity Neural Speech Reconstruction through an Efficient Acoustic-Linguistic Dual-Pathway Framework

This study presents a valuable advance in reconstructing naturalistic speech from intracranial ECoG data using a dual-pathway model. The evidence supporting the claims of the authors is solid, ...

TechCrunch

Largest text-to-speech AI model yet shows ‘emergent abilities’

Researchers at Amazon have trained the largest text-to-speech model yet, which they claim exhibits “emergent” qualities that improve its ability to speak even complex sentences naturally. The ...

EurekAlert!

Researchers propose new and more effective model for automatic speech recognition

Popular voice assistants like Siri and Amazon Alexa have introduced automatic speech recognition (ASR) to the wider public. Though decades in the making, ASR models struggle with consistency and ...

The Verge

Meta releases multilingual speech translation model

It’s like Babel Fish but not in your ear.

WinBuzzer

Google Translate Unlocks Gemini AI Live Speech Translations for All Android Users

Google has unlocked Live Translate for all Android headphones using Gemini 2.5 and has added daily streaks to challenge ...

Geeky Gadgets

ChatTTS, a new open-source AI voice text-to-speech model

ChatTTS is an open-source AI voice text-to-speech (TTS) model that has gained significant popularity on GitHub due to its impressive features and user-friendly design. This model is specifically ...
