Before the Hephaestus Project was founded, Dr. Feigenblatt was interested in the inevitable emergence of text-to-speech (TTS) personal digital assistants. (He was always an ardent promoter of E-books.) Here is a brief recounting of this interest and useful information about the state of the art today (April 2002).
The rapid adoption of PDAs following the introduction of the Palm Pilot in 1996 at last offered palpable hope that text-based E-books could be rendered as speech output on rugged portable devices far smaller than a laptop computer.
At the website of Audible, Inc. today we read that the firm is...
"a leading provider of Internet-delivered premium spoken audio content for playback on personal computers and mobile devices... Audible's business is geared toward the estimated 84 million people who drive alone to work each day, the 40 million mobile workers who spend 20 percent of their time on the road, the 50 million people who exercise regularly, and other consumers who are seeking enrichment when their eyes are busy but their minds are free."
This helps place the following e-mail dialog with Audible in context:
Date: Mon, 25 May 1998 15:02:03 -0500
To: partners@audible.com
From: R I Feigenblatt
Subject: new opportunity

I always believed that a gadget like yours would be of inestimable
value. I wrote an article which appeared the year you were founded
(1995) in the September issue of:
"Atlanta Computer Currents/Technology South"
"Super Highway Information: Newmedia hits the road"
by R. I. Feigenblatt, Ph.D.
in which I offered that:
"Perhaps one day not so far off you'll tell your car
to download the latest best-seller and read it aloud
to you in the celebrity voice-font of your choosing,
as you'd rather not listen to your office email quite yet."

I have looked over your Web site, but I cannot seem to find
description of a feature I - and millions of others - would
want - an appliance into which we could load ALPHANUMERIC
information and have it rendered as AUDIO during that long,
boring commute, etc.
Of course, this could be done in at least two ways.
1. The text to audio conversion could be done on a PC
2. The text to audio conversion could be done with the appliance
(this is the more COMPRESSED file format)
I imagine you would adopt solution (1.) to keep compatibility
with your existing player. Basically, what I am advocating
is a consumer-friendly "publishing kit" that includes
a decent text-to-speech engine. I am sure that you can
find partners eager to work with you on this.
Do you have plans in this direction? Might I be of any
assistance? Do please reply. By the way, if and when you
implement the suggestion I make I shall run right out
and buy your product!
Date: Wed, 27 May 1998 10:20:29 -0400
To: R I Feigenblatt
From: Jonathan Korzen
Subject: Re: new opportunity
Hi Dr. Feigenblatt,
Many thanks for your kind email and interest in Audible Inc. As you suspect
the text to speech capability is something we're considering as a feature
in our next generation MobilePlayer. I [think] it's something which will
become part of our MobileAudio System by the end of the year.
Yours,
Jonathan
It would seem that Audible never took up the opportunity to exploit TTS technology. However, rapid advances in non-volatile solid-state memories make all sorts of storage-centric PDA functions possible these days, and text-based E-books are even easier than MP3 audio to support on account of compactness.
One reference design that the Hephaestus Project did not publish proposed the following scheme. Text is rendered as speech using any of the many TTS programs bundled with sound cards, or by means of free TTS engines, such as those available from Microsoft. The speech can be recorded on an analog audio cassette recorder for later playback. Alternatively, a program like Total Recorder can capture the TTS audio and save it as a sound file, which is then downloaded to a PDA supporting audio output. By 1999 such PDAs had codecs (e.g. HP Mobile Voice) that rendered understandable speech at data rates as low as 300 bytes per second, and the devices sold for as little as $100 at remainder discount. PDA choices also included the various MP3 appliances that emerged in 1999. By 2002 the latest high-end units included tiny multi-gigabyte magnetic drives that obviated the need to compromise audio recording quality, yet low-end TTS software output still sounded artificial rather than human. [Note added May 2002: The use of MP3 players with PC-based TTS software is now discussed at length in "The New York Times." If you have trouble registering (for free) to read the article, the firms whose products are mentioned include Fonix, Premier Programming, and NextUp Technologies.]
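To give a feel for the storage trade-offs in this scheme, here is a rough back-of-the-envelope sketch in Python. Only the 300 bytes per second figure comes from the text above. The MP3 bit rate (128 kbit/s), the speaking rate (150 words per minute), and the average word size (6 bytes, including a space) are illustrative assumptions, and the function names are hypothetical.

```python
# Rough storage estimates for spoken text, per hour of listening.
# Only SPEECH_CODEC_BPS is taken from the article; the rest are assumptions.

SPEECH_CODEC_BPS = 300      # bytes/second, low-rate speech codec figure quoted above
MP3_BPS = 128_000 // 8      # ASSUMED 128 kbit/s MP3, i.e. 16,000 bytes/second
WORDS_PER_MINUTE = 150      # ASSUMED speaking rate of the TTS voice
BYTES_PER_WORD = 6          # ASSUMED average ASCII word plus one space


def audio_bytes(seconds: int, bytes_per_second: int) -> int:
    """Size of an audio recording of the given duration."""
    return seconds * bytes_per_second


def text_bytes(seconds: int) -> int:
    """Size of the plain text that would take `seconds` to read aloud."""
    words = seconds * WORDS_PER_MINUTE // 60
    return words * BYTES_PER_WORD


def estimate(hours: int = 1) -> dict:
    """Bytes needed to hold `hours` of material in each representation."""
    seconds = hours * 3600
    return {
        "text": text_bytes(seconds),
        "speech_codec": audio_bytes(seconds, SPEECH_CODEC_BPS),
        "mp3": audio_bytes(seconds, MP3_BPS),
    }


if __name__ == "__main__":
    for name, size in estimate(1).items():
        print(f"{name:>12}: {size / 1_000_000:.3f} MB per hour")
```

Under these assumptions, an hour of material occupies roughly 0.05 MB as plain text, about 1.1 MB in the low-rate speech codec, and about 57.6 MB as MP3. The ordering, rather than the exact numbers, is the point: storing text and converting it to speech on the device is by far the most compact option.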
The reason our design was never published is that a better alternative soon became commercially available.
It would seem that by 2000 one could buy a marvelous compact device that could download ASCII text from a personal computer and then render it via TTS: A firm called Companion Devices, design successor to Ostrich Software, makes available its Road Runner product at the newly reduced suggested retail price of US$250. While it was sold by Ostrich at US$349, it earned the favorable evaluation of the National Federation of the Blind. No small part of the appeal of the Road Runner is its specialized user interface, optimized for the purpose of controlling an E-book. Combined with its integrated TTS engine, this makes it a vastly superior alternative to the unpublished reference design discussed above.
[More than two years after this flattering assessment was first offered here in February 2000, Dr. Feigenblatt was pleased to receive a sample unit of this product, gratis. While this gift did not influence the original assessment, obviously such a consideration ethically compromises the impartiality of any future statements! Manufacturers are advised that design excellence alone will earn them our accolades, while gratuities will only disqualify them for any future objective evaluation on this Web site!]