Thread: For readers
05-29-2005, 12:16 PM
Jonathan Sachs
Re: For readers

On Sun, 29 May 2005 10:36:39 -0700, "Alan" wrote:

> It sounds like my ideal device needs to be highly constrained as far
> as the speech recognition goes. Thinking out loud, suppose I told it
> I mostly read science and science-fiction and this reduced the
> books-in-print universe to say 20,000 items (an item being a title
> and author, with of course more unique titles than authors). Could a
> speech recognition system learn a universe of words of that size, so
> that it could recognize my pronunciation, by hearing another computer
> program pronounce them all?
>
> I am just trying to imagine the best way to train this system, if
> that is possible at all, with minimal input from the user. How would
> you do it?


First, a caveat: I am not a technical expert in this area. I have a
lot of technical background, and I have relied on speech recognition
software for most of my computer use for about three years, because
carpal tunnel syndrome limits my ability to use a keyboard. In the
process I have picked up a good deal of technical lore, but when you
ask very specific questions, I'm not qualified to answer.

That said, a vocabulary of 20,000 items is certainly within the range
traditionally considered to be limited vocabulary. That's a big help.
It's also a big help if the system may require training to recognize
each user's speech. (I have been assuming otherwise, but I can't offer
cogent reasons for that.) Whether the system you envision is
technically feasible or not, or will become so in the foreseeable
future... I'm not even qualified to guess.

As a user of Dragon NaturallySpeaking, I'm accustomed to thinking of a
"vocabulary item" as a single word, or at most a two- or three-word
phrase. To give an example in your proposed subject area, the
vocabulary would contain "Kim," "Stanley," and "Robinson," and
hopefully the word-sequence frequency data would show that "Kim
Stanley Robinson" is a common sequence of words. The user might add
"Kim Stanley Robinson" as a distinct vocabulary item, improving
recognition for that name, but would never add "Years of Rice and
Salt, by Kim Stanley Robinson" as a single item. I don't know whether
current speech recognition systems would respond well to such a long
entry, or could be adapted to do so, or _should_ be adapted to do so.
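
To make "word-sequence frequency data" a little more concrete, here is a rough Python sketch of the idea as I understand it. It is not how Dragon NaturallySpeaking or any other product builds its tables; the catalog entries, the function names (tokenize, build_tables, sequence_count), and the simple adjacent-pair counting are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical catalog entries (title, author); illustrative data only.
CATALOG = [
    ("Years of Rice and Salt", "Kim Stanley Robinson"),
    ("Red Mars", "Kim Stanley Robinson"),
    ("Green Mars", "Kim Stanley Robinson"),
    ("Blue Mars", "Kim Stanley Robinson"),
]


def tokenize(text):
    """Split a phrase into lowercase words (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_tables(catalog):
    """Return the set of single-word vocabulary items and adjacent-pair counts."""
    vocabulary = set()
    pair_counts = Counter()
    for title, author in catalog:
        # Titles and authors are counted separately, so no pair spans both.
        for phrase in (title, author):
            words = tokenize(phrase)
            vocabulary.update(words)
            pair_counts.update(zip(words, words[1:]))
    return vocabulary, pair_counts


def sequence_count(pair_counts, phrase):
    """Smallest adjacent-pair count along the phrase; 0 if any pair is unseen."""
    words = tokenize(phrase)
    pairs = list(zip(words, words[1:]))
    return min((pair_counts[p] for p in pairs), default=0)
```

With this toy data, "kim", "stanley", and "robinson" are separate vocabulary items, and the pairs ("kim", "stanley") and ("stanley", "robinson") each appear four times, so the full name scores as a well-attested sequence while the same words in reverse order score zero. A real product would presumably use far larger tables together with acoustic models, which this sketch ignores entirely.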

One more caution: while it is easy for users to add words or phrases
or lists of them to a vocabulary, it takes a great deal of labor to
build a new vocabulary and the word-sequence frequency tables and
other arcana that accompany it. The person who does the work must be
trained in computational linguistics _and_ must be thoroughly familiar
with the internals of the speech recognition product in question. For
any given product, there may be a dozen qualified vocabulary builders
in the world! That is part of the reason why the medical and legal
versions of Dragon NaturallySpeaking each sell for a premium of about
$500 over the standard "Professional" version.

This poses a problem for the type of specialized vocabulary you are
contemplating: it may simply be unaffordable to create specialized
vocabularies for any but a small number of highly lucrative markets.

My email address is LLM041103 at earthlink dot net.