How do I use an ARPA file with MS Speech? The documentation for the Microsoft Speech Platform 11 Recognizer implies that one can compile a grammar from an ARPA file.
I was able to compile an ARPA file, for example the small example provided by Microsoft, using the following command line:
CompileGrammar.exe -In stock.arpa -InFormat ARPA
I can use the resulting CFG file in the following test:
This test passes, but note that it uses EmulateRecognize():
using Microsoft.Speech.Recognition;
// ...
using (var engine = new SpeechRecognitionEngine(new CultureInfo("en-US")))
{
    engine.LoadGrammar(new Grammar("stock.cfg"));
    var result = engine.EmulateRecognize("will stock go up");
    Assert.That(result, Is.Not.Null);
}
When I switch to reading the audio from an actual wave file instead of calling EmulateRecognize(), the result is always null and the test fails.
Microsoft states quite clearly that this is supported, yet even very simple examples do not appear to work. What am I doing wrong?
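For reference, an ARPA-format n-gram language model has the general shape below. This is a minimal hand-written sketch for the phrase "will stock go up"; the probabilities are placeholders and Microsoft's actual stock.arpa example may differ. Each line in an n-gram section is log10(probability), the n-gram, and an optional log10 backoff weight:

```text
\data\
ngram 1=6
ngram 2=5

\1-grams:
-0.7782 <s>	-0.3010
-0.7782 </s>
-0.7782 will	-0.3010
-0.7782 stock	-0.3010
-0.7782 go	-0.3010
-0.7782 up	-0.3010

\2-grams:
-0.3010 <s> will
-0.3010 will stock
-0.3010 stock go
-0.3010 go up
-0.3010 up </s>

\end\
```

The \data\ header counts must match the number of entries in each section, or most tools (including CompileGrammar.exe) will reject the file.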
Answer 0 (score: 3)
Your question:
Does the MS Speech Platform 11 Recognizer support ARPA-compiled grammars?
The answer is yes.
The code below works on my side after changing just three things: the Culture, the Grammar, and the WaveFile. I don't know your full code, but based on my test and the demo code, I suspect the root cause is that you need to handle the SpeechRecognized event yourself, which you may not be doing:
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class Program
{
    static bool completed;

    static void Main(string[] args)
    {
        // Initialize an in-process speech recognition engine.
        using (SpeechRecognitionEngine recognizer =
            new SpeechRecognitionEngine(new CultureInfo("en-us")))
        {
            // Create and load a grammar.
            Grammar dictation = new Grammar("stock.cfg");
            dictation.Name = "Dictation Grammar";
            recognizer.LoadGrammar(dictation);

            // Configure the input to the recognizer.
            recognizer.SetInputToWaveFile("test.wav");

            // Attach event handlers for the results of recognition.
            recognizer.SpeechRecognized +=
                new EventHandler<SpeechRecognizedEventArgs>(recognizer_SpeechRecognized);
            recognizer.RecognizeCompleted +=
                new EventHandler<RecognizeCompletedEventArgs>(recognizer_RecognizeCompleted);

            // Perform recognition on the entire file.
            Console.WriteLine("Starting asynchronous recognition...");
            completed = false;
            recognizer.RecognizeAsync();

            // Keep the console window open.
            while (!completed)
            {
                Console.ReadLine();
            }
            Console.WriteLine("Done.");
        }

        Console.WriteLine();
        Console.WriteLine("Press any key to exit...");
        Console.ReadKey();
    }

    // Handle the SpeechRecognized event.
    static void recognizer_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        if (e.Result != null && e.Result.Text != null)
        {
            Console.WriteLine("  Recognized text = {0}", e.Result.Text);
        }
        else
        {
            Console.WriteLine("  Recognized text not available.");
        }
    }

    // Handle the RecognizeCompleted event.
    static void recognizer_RecognizeCompleted(object sender, RecognizeCompletedEventArgs e)
    {
        if (e.Error != null)
        {
            Console.WriteLine("  Error encountered, {0}: {1}",
                e.Error.GetType().Name, e.Error.Message);
        }
        if (e.Cancelled)
        {
            Console.WriteLine("  Operation cancelled.");
        }
        if (e.InputStreamEnded)
        {
            Console.WriteLine("  End of stream encountered.");
        }
        completed = true;
    }
}
The content of the wav file is just "will stock go up" (duration is about 2 seconds).
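If you need a quick way to produce such a test file, the desktop speech synthesizer can generate one. This is a sketch that assumes the desktop System.Speech assembly (Windows .NET Framework) is available; the phrase and file name are placeholders matching the example above:

```csharp
using System.Speech.Synthesis; // desktop .NET Framework assembly

class MakeTestWav
{
    static void Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            // Render the test phrase to a wav file the recognizer can read.
            synth.SetOutputToWaveFile("test.wav");
            synth.Speak("will stock go up");
        }
    }
}
```

Synthesized speech is usually clean enough for the recognizer to match against a small grammar, which makes it handy for automated tests.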
Answer 1 (score: 1)
There are two different answers to this question, depending on which version of the Microsoft Speech SDK you are using. (See: What is the difference between System.Speech.Recognition and Microsoft.Speech.Recognition?)
If you are using System.Speech.Recognition (the desktop version), see seiya1223's answer. The sample code there works great.
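As a reminder, the two SDKs ship near-identical APIs under different namespaces, so the same recognizer code compiles against either one; this fragment is just for orientation, not a complete program:

```csharp
// Desktop (in-process) recognizer, ships with .NET Framework on Windows:
using System.Speech.Recognition;

// Server recognizer, installed with the Microsoft Speech Platform 11 SDK:
// using Microsoft.Speech.Recognition;
```

Which namespace you compile against determines which runtime behavior you get, including the ARPA/CFG difference described below.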
If you are using Microsoft.Speech.Recognition (the server version): perhaps because the server version does not include a dictation engine, the Microsoft.Speech library apparently never matches a CFG compiled from ARPA. However, it still hypothesizes what was said, via the SpeechRecognitionRejected event. Here are the necessary changes to seiya1223's desktop code:
1. Add an event handler for the SpeechRecognitionRejected event.
2. In that handler, check the e.Result.Text property for the final hypothesis.
The following snippet should help illustrate:
static string transcription;

static void Main(string[] args)
{
    using (var recognizer = new SpeechRecognitionEngine(new CultureInfo("en-us")))
    {
        // The rejected hypothesis is delivered here, not via SpeechRecognized.
        recognizer.SpeechRecognitionRejected += SpeechRecognitionRejectedHandler;
        // ...
    }
}

static void SpeechRecognitionRejectedHandler(object sender, SpeechRecognitionRejectedEventArgs e)
{
    if (e.Result != null && !string.IsNullOrEmpty(e.Result.Text))
        transcription = e.Result.Text;
}
This handler is invoked once at the end of recognition. For example, here is the output from seiya1223's code, but with every available event handler hooked up and a bunch of extra logging (emphasis mine):
Starting asynchronous recognition...
In SpeechDetectedHandler:
- AudioPosition = 00:00:01.2300000
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = go
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = stock
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will stock go
In SpeechHypothesizedHandler:
- Grammar Name = Stock; Result Text = will stock go up
In SpeechRecognitionRejectedHandler:
- Grammar Name = Stock; Result Text = will stock go up
In RecognizeCompletedHandler.
- AudioPosition = 00:00:03.2000000; InputStreamEnded = True
- No result.
Done.