Simple-speak, voiceFamily

I’ve been continuing with the burst of activity that led to the gaad-widget, and a re-booted sign-machine. (In the case of sign-machine, it was desirable to produce a quick prototype, then judge the reaction, if any.)

Ever since I worked for a speech technology company (Aurix), then discovered screen readers, I’ve been fascinated by speech synthesisers.

When I wrote a book on Moodle, one of the plugins I developed enabled speech synthesis in Moodle. Unfortunately, the web services that it relies on no longer seem to be available, and the plugin has languished.

I’m happy to say that the Web has moved on since then. With the evolution of HTML5, we now have the Web Speech API, implemented in most modern browsers. So, it’s the ideal time to develop an easy-to-integrate JavaScript wrapper around the API: enter simple-speak.
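For a flavour of what such a wrapper sits on top of, here is a minimal sketch of calling the Web Speech API directly. The `speak` helper and its `env` parameter are hypothetical names used for illustration only, not part of simple-speak; `speechSynthesis` and `SpeechSynthesisUtterance` are the standard browser interfaces.

```javascript
// Hypothetical helper, for illustration only (not simple-speak's code).
// `env` defaults to the global object, so in a browser it picks up
// window.speechSynthesis and window.SpeechSynthesisUtterance.
function speak(text, lang, env = globalThis) {
  // Feature-detect: older browsers (and Node.js) lack the API.
  if (!env.speechSynthesis || !env.SpeechSynthesisUtterance) {
    return false;
  }
  const utterance = new env.SpeechSynthesisUtterance(text);
  if (lang) {
    utterance.lang = lang; // a BCP 47 language tag, such as "en-GB"
  }
  env.speechSynthesis.speak(utterance); // queues the utterance for speaking
  return true;
}

// Returns false, and says nothing, where the API is missing.
speak("Hello. I'm simple-speak.", 'en-GB');
```

Passing the environment in as a parameter is just a convenience that lets the helper be tried outside a browser; a real page would probably call `speechSynthesis` directly.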

Speech input

It can be embedded on a page with 3 lines of HTML — including jQuery (thanks to unpkg and RawGit):

<div id="id-simple-speak"> Hello. I'm simple-speak. </div>

<script src="https://unpkg.com/[email protected]/dist/jquery.min.js"></script>
<script src="https://unpkg.com/[email protected]#._.js"></script>

The synthesiser can also be embedded via an <iframe> and, depending on the capabilities of the browser, can speak in different languages. (I haven’t yet started to internationalize the user-interface.) Here’s the source code for an example in Spanish:

<iframe class="simple-speak-ifr" width="100%" height="75" src=
  "https://cdn.rawgit.com/nfreear/simple-speak/1.3.0-beta/embed/?lang=es-ES;q=Buenos%20d%C3%ADas.%20%C2%BFc%C3%B3mo%20est%C3%A1s%3F"
></iframe>
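The `q` parameter in that `src` is simply the phrase, percent-encoded. As a rough sketch, the URL could be built in JavaScript like this. The `buildEmbedUrl` helper is a hypothetical name, the base URL is copied from the example above, and the `;` separator follows that example (I am not assuming anything about other separators here).

```javascript
// Base URL copied from the Spanish example; the helper name is hypothetical.
const EMBED_BASE = 'https://cdn.rawgit.com/nfreear/simple-speak/1.3.0-beta/embed/';

function buildEmbedUrl(lang, phrase) {
  // encodeURIComponent percent-encodes spaces, accented letters and
  // punctuation such as "?" and "¿", so the phrase is safe inside a URL.
  return EMBED_BASE +
    '?lang=' + encodeURIComponent(lang) +
    ';q=' + encodeURIComponent(phrase);
}

console.log(buildEmbedUrl('es-ES', 'Buenos días. ¿cómo estás?'));
```

The logged string should match the `src` attribute of the Spanish `<iframe>`.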

Here’s a Mandarin Chinese synthesiser, in an <iframe>:

voiceFamily

A useful feature of simple-speak is the voiceFamily configuration property. This property behaves a bit like font-family in CSS: it lets you specify an ordered, comma-separated, cross-platform list of synthesis voices.

In the following example, the synthesiser uses a Google voice on the Chrome browser, falls back to a Microsoft voice on various browsers on Windows, then falls back to a generic female voice if the first two options aren’t available.

<div data-simple-speak=
  '{ "voiceFamily": "Google UK English Female, Microsoft Anna - English (US), female" }'
></div>

So, in the same way as a designer lists a range of fonts available across different platforms, you can list your preferred and fall-back voices.

body {
  font-family: Helmet, Freesans, Helvetica, Arial, sans-serif;  /* CSS style: BBC */
}
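To make the fall-back idea concrete, here is a minimal sketch of how an ordered voice list might be resolved against the voices a browser reports. This is not simple-speak's actual implementation: the `pickVoice` helper is hypothetical, and treating a generic entry such as "female" as a substring of the voice name is my assumption for this sketch (the Web Speech API's voice objects expose a `name` and `lang`, but no gender property).

```javascript
// Hypothetical sketch, NOT simple-speak's real code.
// voiceFamily: comma-separated list of preferred voice names, best first.
// voices: array of objects with a `name` property, as returned by
//         speechSynthesis.getVoices() in a browser.
function pickVoice(voiceFamily, voices) {
  const wanted = voiceFamily
    .split(',')
    .map((name) => name.trim().toLowerCase())
    .filter((name) => name.length > 0);

  for (const name of wanted) {
    // Prefer an exact (case-insensitive) name match ...
    const exact = voices.find((v) => v.name.toLowerCase() === name);
    if (exact) return exact;
    // ... then accept any voice whose name contains the entry
    // (assumption: this is how a generic entry like "female" is handled).
    const partial = voices.find((v) => v.name.toLowerCase().includes(name));
    if (partial) return partial;
  }
  return null; // nothing matched; the caller can use the browser default
}

// Browser-only usage, guarded so the sketch also loads outside a browser.
if (typeof speechSynthesis !== 'undefined') {
  const family = 'Google UK English Female, Microsoft Anna - English (US), female';
  console.log(pickVoice(family, speechSynthesis.getVoices()));
}
```

Two caveats: some browsers return an empty list from `getVoices()` until the `voiceschanged` event has fired, and a voice name that itself contains a comma would not survive this simple split.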

View and contribute to this spreadsheet of available voices (get-voices).

Use cases?

I’d love your ideas for how to harness the power of speech. Some possible use cases for simple-speak:

As a component of online learning tools. For example, a re-boot of the Moodle SimpleSpeak filter plugin,
A site widget (a bit like Google’s translate your site service),
A browser extension,
An updated synthesiser for an ‘in-browser’ screen-reader, such as WebAnywhere. (GitHub) (Currently the screen-reader appears to be broken — a real pity.)

simple-speak is available via the NPM registry and GitHub.

Go play! And, please send feedback via the comments, to @nfreear on Twitter, and on Facebook.

Update

Spell me

15 June 2017 ~ I’m working on a spelling mode …

1 July 2017 ~ version 1.3.0 (beta) is released. It features a spelling mode, and triggers a custom message each time it starts to speak a phrase:

Here’s the HTML <iframe> source for the above spelling widget (mode=spell):

<iframe class="simple-speak-ifr" aria-label="Speech synthesis" src=
  "https://cdn.rawgit.com/nfreear/simple-speak/1.3.0-beta/embed/?mode=spell;rate=0.9&q=Spell%20me!"
></iframe>