Speech Recognition

Dobai-Pataky_BSSSSl · November 2, 2010, 9:51pm

Where would I start if I wanted to create some small programs/scripts
that could recognize a few verbal commands from me and respond
accordingly? I’m not thinking of anything real sophisticated here, just
a few commands, and it only has to recognize one voice (mine).

jnb · November 2, 2010, 10:19pm

On Tue, Nov 2, 2010 at 9:51 PM, Jonathan B. wrote:

Where would I start if I wanted to create some small programs/scripts
that could recognize a few verbal commands from me and respond
accordingly? I’m not thinking of anything real sophisticated here, just
a few commands, and it only has to recognize one voice (mine).

By buying Dragon NaturallySpeaking and scripting that (via COM, if
you are on Windows, but I guess any other IPC mechanism that DNS supports
should work). Alternatively: any voice recognition system that ships
as part of your OS of choice (Windows Vista has one, Linux should
have something available, somewhere, but I don’t know about Mac OS X).

Yes, it’s overkill, but so is using voice activated scripts / tools.

On a more serious note: voice recognition is hard. If you have ever
encountered a speech recognition system when calling a hotline, I’m sure
you are fully aware of how difficult it is to pull off. And those folks
can buy all the CPU power and software they need.

TBH, without a degree in Linguistics and comp. sci. I wouldn’t even
/try/ to roll my own solution for this.

–
Phillip G.


jnb · November 3, 2010, 5:42pm

Let me be clear: I /don’t/ want to create my own voice recognition
software. (Holy Waveform, Batman!) I just wanted to tie some existing
software into a script that could launch a few of my favorite commands
and applications. E.g., “Computer, launch Firefox”.
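
A minimal sketch of the "tie it into a script" half of this, assuming some existing recognizer already hands back the spoken phrase as text. The phrase table, the program names, and the helper names (`normalize`, `lookup`, `dispatch`) are all hypothetical and only for illustration.

```python
import subprocess

# Hypothetical phrase-to-command table; phrases and programs are
# illustrative assumptions, not part of any existing tool.
COMMANDS = {
    "computer launch firefox": ["firefox"],
    "computer launch terminal": ["xterm"],
}


def normalize(text):
    """Lowercase recognizer output and replace punctuation with spaces."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text)
    return " ".join(cleaned.lower().split())


def lookup(text):
    """Return the argv list mapped to a recognized phrase, or None."""
    return COMMANDS.get(normalize(text))


def dispatch(text):
    """Start the mapped program; return True if a command was found."""
    argv = lookup(text)
    if argv is None:
        return False
    subprocess.Popen(argv)  # launch without waiting for the program to exit
    return True
```

Calling `dispatch("Computer, launch Firefox")` would start Firefox if it is on the PATH; unknown phrases are ignored rather than guessed at, which seems the safer default for a voice-driven launcher.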

jnb · November 3, 2010, 3:10pm

Corrected Javadoc URL for Sphinx4:

http://cmusphinx.sourceforge.net/sphinx4/javadoc

jnb · November 3, 2010, 6:03pm

Jonathan B. wrote in post #959050:

Let me be clear: I /don’t/ want to create my own voice recognition
software. (Holy Waveform, Batman!) I just wanted to tie some existing
software into a script that could launch a few of my favorite commands
and applications. E.g., “Computer, launch Firefox”.

Oh, and I’m running Gentoo Linux, so a Linux-only solution is fine.

jnb · November 4, 2010, 12:17am

The most seamless solution would be Sphinx4 with JRuby. But I would
recommend using pocketsphinx and calling it via the command line; I found
that it performs better and is much easier to use. Just use one of the
acoustic models that come with pocketsphinx, write a JSGF grammar,
generate a dictionary, and you’re good to go. As long as you have a
limited grammar and a reasonably clear audio signal, it’s not rocket
science to build something that works well.
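
To make the "write a JSGF grammar" step concrete, here is a rough sketch that builds a small grammar and the word list a pronunciation dictionary would need to cover. The phrases and the function names (`build_jsgf`, `vocabulary`) are assumptions for illustration. Generating the pronunciations themselves and passing the files to pocketsphinx are not shown, since the exact options depend on the installed version.

```python
def build_jsgf(phrases, name="commands"):
    """Return JSGF grammar text with one public rule listing the phrases."""
    alternatives = " | ".join(sorted(phrases))
    return (
        "#JSGF V1.0;\n"
        f"grammar {name};\n"
        f"public <command> = {alternatives};\n"
    )


def vocabulary(phrases):
    """Return the sorted unique words that the dictionary must contain."""
    return sorted({word for phrase in phrases for word in phrase.split()})


# Hypothetical command phrases, chosen only as an example.
PHRASES = ["computer launch firefox", "computer launch terminal"]

if __name__ == "__main__":
    print(build_jsgf(PHRASES), end="")
    print(vocabulary(PHRASES))
```

Every word in the grammar has to appear in the dictionary with matching spelling and case, so it helps to derive both from the same phrase list.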

jnb · November 19, 2010, 7:45am

On Wed, Nov 3, 2010 at 6:17 PM, Andreas S. wrote:

The most seamless solution would be Sphinx4 with JRuby. But I would
recommend using pocketsphinx and calling it via the command line; I found
that it performs better and is much easier to use. Just use one of the
acoustic models that come with pocketsphinx, write a JSGF grammar,
generate a dictionary, and you’re good to go. As long as you have a
limited grammar and a reasonably clear audio signal, it’s not rocket
science to build something that works well.

It performs better from the command line than when invoked directly
from JRuby? If so, please prove it and file a perf bug.

Charlie

jnb · June 11, 2011, 1:57pm

Jonathan B. wrote in post #958821:

Where would I start if I wanted to create some small programs/scripts
that could recognize a few verbal commands from me and respond
accordingly? I’m not thinking of anything real sophisticated here, just
a few commands, and it only has to recognize one voice (mine).

Hi

Have you tried out the suggested solutions?

Please let me know which one worked out.

ashley999 · September 3, 2021, 10:26am

Try to follow the example of similar services. My favorite is a sound-to-text converter. It is really accurate and provides high-quality, AI-based automatic transcription.