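With Microsoft speech recognition, can I get the moments when recognition starts and ends?

Asked: 2014-07-18 18:54:55

Tags: speech-recognition

I am using the Microsoft engine for speech recognition. The code is as follows: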

using System;
using System.Speech.Recognition;
using System.Threading;

static ManualResetEvent _completed = null;
static void Main(string[] args)
{
     _completed = new ManualResetEvent(false);
     SpeechRecognitionEngine _recognizer = new SpeechRecognitionEngine();
     _recognizer.LoadGrammar(new Grammar(new GrammarBuilder("test")) { Name = "testGrammar" }); // load a grammar
     _recognizer.LoadGrammar(new Grammar(new GrammarBuilder("exit")) { Name = "exitGrammar" }); // load an "exit" grammar
     _recognizer.SpeechRecognized += _recognizer_SpeechRecognized; // called once per recognized phrase
     _recognizer.SetInputToDefaultAudioDevice(); // set the input of the speech recognizer to the default audio device
     _recognizer.RecognizeAsync(RecognizeMode.Multiple); // recognize speech asynchronously, phrase after phrase
     _completed.WaitOne(); // wait until speech recognition is completed
     _recognizer.Dispose(); // dispose the speech recognition engine
}
// static, so that it can be subscribed from the static Main method
static void _recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
     if (e.Result.Text == "test") // e.Result.Text contains the recognized text
     {
         Console.WriteLine("The test was successful!");
     }
     else if (e.Result.Text == "exit")
     {
         _completed.Set(); // unblock Main so the program can exit
     }
}

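It looks very cool. When I speak, the program picks up "test" or "exit". But can I get the exact moment at which the program starts recognizing a word, and the moment at which it finishes and starts listening for the next word?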

2 answers:

Answer 0: (score: 1)

RecognitionResult.Audio contains the start time and duration of the audio.

void SpeechRecognizedHandler(object sender, SpeechRecognizedEventArgs e)
{
  if (e.Result == null) return;

  // Add event handler code here.

  // The following code illustrates some of the information available
  // in the recognition result.
  Console.WriteLine("Grammar({0}): {1}", e.Result.Grammar.Name, e.Result.Text);
  Console.WriteLine("Audio for result:");
  Console.WriteLine("  Start time: " + e.Result.Audio.StartTime);
  Console.WriteLine("  Duration: " + e.Result.Audio.Duration);
  Console.WriteLine("  Format: " + e.Result.Audio.Format.EncodingFormat);
}
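
If an explicit end moment is wanted, one option is to add the duration to the start time. The following is only a sketch: the class name RecognitionTiming and the method LogTiming are hypothetical, and treating StartTime + Duration as "the moment the phrase ended" is an assumption about how to interpret those two properties, not something the answer states.

```csharp
using System;
using System.Speech.Recognition;

static class RecognitionTiming
{
    // Hypothetical handler: subscribe with
    //   recognizer.SpeechRecognized += RecognitionTiming.LogTiming;
    public static void LogTiming(object sender, SpeechRecognizedEventArgs e)
    {
        // Audio may be null, e.g. when the result does not come from real audio input.
        RecognizedAudio audio = e.Result?.Audio;
        if (audio == null) return;

        DateTime start = audio.StartTime;       // system time at the start of the recognized audio
        DateTime end = start + audio.Duration;  // assumption: start + duration approximates the end

        Console.WriteLine("\"{0}\": {1:HH:mm:ss.fff} -> {2:HH:mm:ss.fff}",
            e.Result.Text, start, end);
    }
}
```

With RecognizeMode.Multiple the handler runs once per recognized phrase, so each phrase gets its own start/end pair.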

Answer 1: (score: 0)

SpeechRecognitionEngine has a SpeechDetected event. You can use it to determine when the recognizer starts processing the next word.

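From the Remarks section of the documentation linked above (emphasis mine):

  Each speech recognizer has an algorithm to distinguish between silence and speech. When the SpeechRecognitionEngine performs a speech recognition operation, it raises the SpeechDetected event when its algorithm identifies the input as speech. The AudioPosition property of the associated SpeechDetectedEventArgs object indicates the location in the input stream where the recognizer detected speech. The SpeechRecognitionEngine raises the SpeechDetected event before it raises any of the SpeechHypothesized, SpeechRecognized, or SpeechRecognitionRejected events.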
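
As a rough illustration of the event order described in the remarks, the handlers could be wired up as below. This is a sketch, not tested against a live microphone: the class DetectionLogging and the method Attach are hypothetical names, and printing DateTime.Now next to AudioPosition is just one assumed way to relate the stream position to wall-clock time.

```csharp
using System;
using System.Speech.Recognition;

static class DetectionLogging
{
    // Hypothetical helper: call once, before RecognizeAsync, on an already configured engine.
    public static void Attach(SpeechRecognitionEngine recognizer)
    {
        // Raised when the engine decides that the input has changed from silence to speech.
        recognizer.SpeechDetected += (sender, e) =>
            Console.WriteLine("Speech detected at stream position {0} (wall clock {1:HH:mm:ss.fff})",
                e.AudioPosition, DateTime.Now);

        // Raised after SpeechDetected when the input matched a loaded grammar.
        recognizer.SpeechRecognized += (sender, e) =>
            Console.WriteLine("Recognized \"{0}\" (wall clock {1:HH:mm:ss.fff})",
                e.Result.Text, DateTime.Now);

        // Raised after SpeechDetected when the input matched no grammar with enough confidence.
        recognizer.SpeechRecognitionRejected += (sender, e) =>
            Console.WriteLine("Rejected input (wall clock {0:HH:mm:ss.fff})", DateTime.Now);
    }
}
```

In the question's program this would be called as DetectionLogging.Attach(_recognizer); just before _recognizer.RecognizeAsync(RecognizeMode.Multiple);.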