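Does the MS Speech Platform 11 Recognizer support ARPA compiled grammars?

Time: 2018-09-18 14:42:49

Tags: speech-recognition sapi microsoft-speech-platform

How can I use ARPA files with MS Speech? The documentation for the Microsoft Speech Platform 11 Recognizer implies that a grammar can be compiled from an ARPA file.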

I can compile an ARPA file, for example the tiny sample provided by Microsoft, with the following command line:

CompileGrammar.exe -In stock.arpa -InFormat ARPA

I can then use the resulting CFG file in the following test:

using Microsoft.Speech.Recognition;
// ...
using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
{
    engine.LoadGrammar(new Grammar("stock.cfg"));
    var result = engine.EmulateRecognize("will stock go up");
    Assert.That(result, Is.Not.Null);
}

This test passes, but note that it uses EmulateRecognize(). When I switch to an actual audio file instead, the result is always null and the test fails.
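For readers who want to reproduce the failing case, here is a minimal sketch of an audio-file variant of the test. It assumes the same stock.cfg grammar; the file name go-up.wav, the class name, and the Describe helper are hypothetical and only serve the illustration. SetInputToWaveFile and Recognize are existing SpeechRecognitionEngine methods, and Recognize returns null when nothing is recognized.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

static class AudioFileTest
{
    // Hypothetical helper: turns a possibly-null result text into a log line.
    public static string Describe(string text) =>
        text == null ? "no result" : "recognized: " + text;

    static void Main()
    {
        using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            engine.LoadGrammar(new Grammar("stock.cfg"));

            // Assumption: a short recording of the phrase "will stock go up".
            engine.SetInputToWaveFile("go-up.wav");

            // Synchronous recognition; returns null if no phrase was recognized.
            RecognitionResult result = engine.Recognize();
            Console.WriteLine(Describe(result?.Text));
        }
    }
}
```

With the behaviour described in the question, this sketch would be expected to take the null branch.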

Microsoft states quite clearly that this is supported, yet even very simple examples don't seem to work. What am I doing wrong?

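2 answers:

Answer 0 (score: 3)

Your question:

Does the MS Speech Platform 11 Recognizer support ARPA compiled grammars?

The answer is yes.

The code below works on my side; only three properties need to be changed: Culture / Grammar / WaveFile. I don't know your full code, but based on my test and demo code, I suspect the root cause is that the SpeechRecognized event has to be handled on our side, which you may not have done.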

using System;
using System.Globalization;
using System.Speech.Recognition;

class Program
{
    // Set by the RecognizeCompleted handler once recognition has finished.
    static bool completed;

    static void Main(string[] args)
    {
        // Initialize an in-process speech recognition engine.
        using (SpeechRecognitionEngine recognizer =
           new SpeechRecognitionEngine(new CultureInfo("en-us")))
        {
            // Create and load a grammar.
            Grammar dictation = new Grammar("stock.cfg");
            dictation.Name = "Dictation Grammar";

            recognizer.LoadGrammar(dictation);

            // Configure the input to the recognizer.
            recognizer.SetInputToWaveFile("test.wav");

            // Attach event handlers for the results of recognition.
            recognizer.SpeechRecognized +=
              new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);
            recognizer.RecognizeCompleted +=
              new EventHandler<RecognizeCompletedEventArgs>(recognizer_RecognizeCompleted);

            // Perform recognition on the entire file.
            Console.WriteLine("Starting asynchronous recognition...");
            completed = false;
            recognizer.RecognizeAsync();

            // Keep the console window open until recognition has completed.
            while (!completed)
            {
                Console.ReadLine();
            }
            Console.WriteLine("Done.");
        }

        Console.WriteLine();
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }

    // Handle the SpeechRecognized event.
    static void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        if (e.Result != null && e.Result.Text != null)
        {
            Console.WriteLine("  Recognized text =  {0}", e.Result.Text);
        }
        else
        {
            Console.WriteLine("  Recognized text not available.");
        }
    }

    // Handle the RecognizeCompleted event.
    static void recognizer_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
    {
        if (e.Error != null)
        {
            Console.WriteLine("  Error encountered, {0}: {1}",
                e.Error.GetType().Name, e.Error.Message);
        }
        if (e.Cancelled)
        {
            Console.WriteLine("  Operation cancelled.");
        }
        if (e.InputStreamEnded)
        {
            Console.WriteLine("  End of stream encountered.");
        }
        completed = true;
    }
}


The WAV file contains just the phrase "will stock go up" (about 2 seconds long).

For more information: https://docs.microsoft.com/en-us/dotnet/api/system.speech.recognition.speechrecognitionengine.setinputtowavefile?redirectedfrom=MSDN&view=netframework-4.7.2#System_Speech_Recognition_SpeechRecognitionEngine_SetInputToWaveFile_System_String_

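Answer 1 (score: 1)

There are two different answers to this question, depending on which version of the Microsoft Speech SDK you are using. (See: What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?)

System.Speech (desktop version)

In this case, see seiya1223's answer. The sample code there works well.

Microsoft.Speech (server version)

Perhaps because the server version does not include the "dictation engine", the Microsoft.Speech library apparently never matches a CFG compiled from ARPA. However, it still hypothesizes what was said and reports it through the SpeechRecognitionRejected event. These are the changes needed to seiya1223's desktop code:

  1. Change your using statement from System.Speech to Microsoft.Speech.
  2. Add an event handler for the SpeechRecognitionRejected event.
  3. In the event handler, read the final hypothesis from the e.Result.Text property.

The following snippet should help illustrate: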

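using System.Globalization;
using Microsoft.Speech.Recognition;

static string transcription;

static void Main(string[] args)
{
  using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-us")))
  {
    // The server recognizer reports its final hypothesis through this event.
    recognizer.SpeechRecognitionRejected += SpeechRecognitionRejectedHandler;
    // ...
  }
}

// Static so that it can be attached from the static Main method.
static void SpeechRecognitionRejectedHandler(object sender, SpeechRecognitionRejectedEventArgs e)
{
  if (e.Result != null && !string.IsNullOrEmpty(e.Result.Text))
    transcription = e.Result.Text;
}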

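This handler is called once, at the end of recognition. For example, here is the output of seiya1223's code, but with all available event handlers attached and a lot of extra logging (emphasis mine):

Starting asynchronous recognition...
   In SpeechDetectedHandler:
   - AudioPosition = 00:00:01.2300000
   In SpeechHypothesizedHandler:
   - Grammar Name = Stock; Result Text = go
   In SpeechHypothesizedHandler:
   - Grammar Name = Stock; Result Text = will
   In SpeechHypothesizedHandler:
   - Grammar Name = Stock; Result Text = stock
   In SpeechHypothesizedHandler:
   - Grammar Name = Stock; Result Text = will stock go
   In SpeechHypothesizedHandler:
   - Grammar Name = Stock; Result Text = will stock go up
   In SpeechRecognitionRejectedHandler:
   - Grammar Name = Stock; Result Text = will stock go up

   In RecognizeCompletedHandler.
   - AudioPosition = 00:00:03.2000000; InputStreamEnded = True
   - No result.
Done.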
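
The logging code that produced this output is not shown in the answer. The following is only a sketch of how such handlers might be wired up with Microsoft.Speech, assuming a grammar file stock.cfg named "Stock" and an input file test.wav. The class name, the Format helper, and the use of ManualResetEventSlim to wait for completion are illustrative choices, not the author's code. The events and properties used (SpeechDetected, SpeechHypothesized, SpeechRecognitionRejected, RecognizeCompleted, AudioPosition, InputStreamEnded, Result) are members of SpeechRecognitionEngine and its event argument types.

```csharp
using System;
using System.Globalization;
using System.Threading;
using Microsoft.Speech.Recognition;

static class RejectedHypothesisDemo
{
    // Signalled by the RecognizeCompleted handler.
    static readonly ManualResetEventSlim Done = new ManualResetEventSlim(false);

    // Hypothetical helper that mirrors the layout of the log lines.
    public static string Format(string grammarName, string text) =>
        string.Format("- Grammar Name = {0}; Result Text = {1}", grammarName, text);

    static void Log(string handler, RecognitionResult result)
    {
        Console.WriteLine("In " + handler + ":");
        // Grammar may be null, so use a null-conditional lookup for its name.
        Console.WriteLine(Format(result.Grammar?.Name, result.Text));
    }

    static void Main()
    {
        using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            recognizer.LoadGrammar(new Grammar("stock.cfg") { Name = "Stock" });
            recognizer.SetInputToWaveFile("test.wav");

            recognizer.SpeechDetected += (s, e) =>
                Console.WriteLine("In SpeechDetectedHandler:\n- AudioPosition = {0}", e.AudioPosition);
            recognizer.SpeechHypothesized += (s, e) =>
                Log("SpeechHypothesizedHandler", e.Result);
            // With an ARPA-derived CFG, the final hypothesis is expected to arrive here.
            recognizer.SpeechRecognitionRejected += (s, e) =>
                Log("SpeechRecognitionRejectedHandler", e.Result);
            recognizer.RecognizeCompleted += (s, e) =>
            {
                Console.WriteLine("In RecognizeCompletedHandler.");
                Console.WriteLine("- AudioPosition = {0}; InputStreamEnded = {1}",
                    e.AudioPosition, e.InputStreamEnded);
                if (e.Result == null) Console.WriteLine("- No result.");
                Done.Set();
            };

            Console.WriteLine("Starting asynchronous recognition...");
            recognizer.RecognizeAsync();
            Done.Wait(); // Block until RecognizeCompleted has fired.
            Console.WriteLine("Done.");
        }
    }
}
```

Waiting on an event object instead of polling a flag is a design choice of this sketch; it assumes the recognizer raises its events on a thread other than the one blocked in Main, as is typical for a console application.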