Speech-devices and stammering
Do you stammer/stutter? I do.
While working on the ADMINS Chat-bot project last year, we came to a significant realisation.
The speech recognizer used by default in Webchat (and many other speech user-interfaces) assumes that the first pause in your speech (even a small one) is the end of your utterance, and moves on with the conversation. Now this may be acceptable if your responses consist of “yes”, “no” and “continue”, or you have typical speech patterns.
However, as soon as you’re answering open-ended questions, or have a stammer/stutter or another disfluency, this soon breaks down and the Chat-bot cuts you off mid-response. I characterize that as impatient!
Our answer was to develop an alternative, adaptive speech recognizer, built on Microsoft’s Cognitive Services Speech SDK. It was configured with a longer timeout for the open-ended questions in our Chat-bot’s dialog, while recognizing and responding more quickly to shorter closed responses like “yes”, “no” and “move on”.
After some experimentation, we settled on a timeout of 1.75 seconds (a compromise between everyone’s needs), and the adaptive speech recognizer was used successfully during the main trial for ADMINS. Always at the back of my mind was the question — could the speech recognizer adapt to the speaker, not just to the conversation?
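To make the idea concrete, here is a minimal sketch of how a per-question timeout might be set with the Azure Cognitive Services Speech SDK in Python. This is an illustration of the technique, not the ADMINS code itself: it assumes an SDK version that exposes the Speech_SegmentationSilenceTimeoutMs property, and the 500 ms value for closed responses is a hypothetical example (only the 1.75-second open-ended figure comes from our trial).

```python
# Sketch only: adjusting the end-of-utterance silence timeout per dialog turn.
# Assumes azure-cognitiveservices-speech with Speech_SegmentationSilenceTimeoutMs support.
import azure.cognitiveservices.speech as speechsdk

OPEN_ENDED_TIMEOUT_MS = "1750"       # the compromise value from the ADMINS trial
CLOSED_RESPONSE_TIMEOUT_MS = "500"   # hypothetical value for "yes"/"no"/"move on" turns

def make_recognizer(subscription_key: str, region: str, open_ended: bool) -> speechsdk.SpeechRecognizer:
    """Build a recognizer whose silence timeout depends on the type of question being asked."""
    config = speechsdk.SpeechConfig(subscription=subscription_key, region=region)
    timeout = OPEN_ENDED_TIMEOUT_MS if open_ended else CLOSED_RESPONSE_TIMEOUT_MS
    # How long the recognizer waits through a pause before deciding the utterance has ended.
    config.set_property(speechsdk.PropertyId.Speech_SegmentationSilenceTimeoutMs, timeout)
    return speechsdk.SpeechRecognizer(speech_config=config)

# Usage: pick the timeout when the dialog reaches each question.
# recognizer = make_recognizer(key, "uksouth", open_ended=True)
# result = recognizer.recognize_once()
```

Adapting the timeout to the speaker, rather than just the question type, would mean varying these values per user, which is exactly the open question above.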
Since then, I’ve realised that most speech-devices, including smart speakers like Alexa, voice assistants like Siri, and automated phone-systems, are not well suited to those with a stammer/stutter or other non-typical speech.
What I sense is needed is data on the attitudes towards speech-devices of people who have disfluencies, and research into the effectiveness of potential adaptations. This could then be used to drive change, potentially through standards such as the Web Content Accessibility Guidelines.
I’m working to put together a research project. As an informal first step, I mocked up a short survey on Twitter, with interesting results!
Luckily, I’m not the only one working in this space! Project Euphonia and Project Understood are important and ambitious projects to effect change, and researchers including Leigh Clark are doing valuable work in this area.
If you’re interested in collaborating, please reach out to me on Twitter or via The Open University.
See the research papers on ORO, and the Microsoft and OU research news stories.