I am using the Microsoft engine (System.Speech) for speech recognition. The code is as follows:
using System;
using System.Speech.Recognition;
using System.Threading;

class Program
{
    static ManualResetEvent _completed = null;

    static void Main(string[] args)
    {
        _completed = new ManualResetEvent(false);
        SpeechRecognitionEngine _recognizer = new SpeechRecognitionEngine();
        // Name is set with an object initializer on each Grammar.
        _recognizer.LoadGrammar(new Grammar(new GrammarBuilder("test")) { Name = "testGrammar" }); // load the "test" grammar
        _recognizer.LoadGrammar(new Grammar(new GrammarBuilder("exit")) { Name = "exitGrammar" }); // load the "exit" grammar
        _recognizer.SpeechRecognized += _recognizer_SpeechRecognized;
        _recognizer.SetInputToDefaultAudioDevice(); // use the default audio device as input
        _recognizer.RecognizeAsync(RecognizeMode.Multiple); // recognize speech asynchronously, repeatedly
        _completed.WaitOne(); // block until "exit" has been recognized
        _recognizer.Dispose(); // dispose the speech recognition engine
    }

    // Must be static: it is subscribed from the static Main method.
    static void _recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        if (e.Result.Text == "test") // e.Result.Text contains the recognized text
        {
            Console.WriteLine("The test was successful!");
        }
        else if (e.Result.Text == "exit")
        {
            _completed.Set();
        }
    }
}
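It works very nicely: when I speak, the program picks up "test" or "exit". But can I get the exact moment when the engine starts recognizing a word, and the moment when it finishes that word and starts listening for the next one?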
Answer 0 (score: 1)
RecognitionResult.Audio contains the start time and duration of the audio that produced the result.
void SpeechRecognizedHandler(object sender, SpeechRecognizedEventArgs e)
{
    if (e.Result == null) return;

    // Add event handler code here.
    // The following code illustrates some of the information available
    // in the recognition result.
    Console.WriteLine("Grammar({0}): {1}", e.Result.Grammar.Name, e.Result.Text);
    Console.WriteLine("Audio for result:");
    Console.WriteLine("  Start time: " + e.Result.Audio.StartTime);
    Console.WriteLine("  Duration: " + e.Result.Audio.Duration);
    Console.WriteLine("  Format: " + e.Result.Audio.Format.EncodingFormat);
}
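Answer 1 (score: 0)
SpeechRecognitionEngine has a SpeechDetected event. You can use it to determine when recognition of the next word begins.
From the Remarks section of the documentation linked above (emphasis mine):
Each speech recognizer has an algorithm to distinguish between silence and speech. When the SpeechRecognitionEngine performs a speech recognition operation, it raises the SpeechDetected event when its algorithm identifies the input as speech. The AudioPosition property of the associated SpeechDetectedEventArgs object indicates the location in the input stream where the recognizer detected speech. The SpeechRecognitionEngine raises the SpeechDetected event before it raises any of the SpeechHypothesized, SpeechRecognized, or SpeechRecognitionRejected events.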
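As a rough illustration, here is a minimal sketch that subscribes to SpeechDetected alongside SpeechRecognized and prints the timing of each word. It assumes a Windows machine with a working default microphone and a reference to System.Speech. The class name SpeechTimingSketch, the handler names, the single combined grammar named "commands", and the idea of computing the end time as StartTime + Duration are my own choices for the example, not something taken from the question or the documentation quote.

```csharp
using System;
using System.Speech.Recognition;
using System.Threading;

static class SpeechTimingSketch
{
    // Signalled when "exit" is recognized, as in the question's code.
    static readonly ManualResetEvent Completed = new ManualResetEvent(false);

    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine())
        {
            // Assumption for brevity: one grammar covering both words.
            var commands = new Choices("test", "exit");
            recognizer.LoadGrammar(new Grammar(new GrammarBuilder(commands)) { Name = "commands" });

            recognizer.SpeechDetected += OnSpeechDetected;     // raised when input is first classified as speech
            recognizer.SpeechRecognized += OnSpeechRecognized; // raised when a phrase has been matched

            recognizer.SetInputToDefaultAudioDevice();
            recognizer.RecognizeAsync(RecognizeMode.Multiple);

            Completed.WaitOne();               // block until "exit" is heard
            recognizer.RecognizeAsyncCancel(); // stop listening before the engine is disposed
        }
    }

    static void OnSpeechDetected(object sender, SpeechDetectedEventArgs e)
    {
        // AudioPosition is a TimeSpan offset into the input stream, not a wall-clock time.
        Console.WriteLine("Speech detected at stream position {0}", e.AudioPosition);
    }

    static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        if (e.Result == null) return;

        Console.WriteLine("Recognized \"{0}\"", e.Result.Text);

        // Audio may be null in some situations, so guard before reading it.
        RecognizedAudio audio = e.Result.Audio;
        if (audio != null)
        {
            DateTime start = audio.StartTime;        // when the recognized audio began
            DateTime end = start + audio.Duration;   // derived end time (an assumption of this sketch)
            Console.WriteLine("  Started:  {0:HH:mm:ss.fff}", start);
            Console.WriteLine("  Finished: {0:HH:mm:ss.fff}", end);
        }

        if (e.Result.Text == "exit")
        {
            Completed.Set();
        }
    }
}
```

With RecognizeMode.Multiple the engine keeps listening after each result, so the next SpeechDetected event is probably the closest thing to "the recognizer has started on another word". Note that AudioPosition and StartTime are on different scales (stream offset versus clock time), so compare like with like.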