How do I return default semantics from System.Speech.Recognition when "optional" items are omitted?

Asked: 2011-04-18 15:23:30

Tags: c# .net speech-recognition speech

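I'm an intermediate C# programmer but an absolute beginner with System.Speech. I'm working through some examples to get a feel for how the API works, and I'm already stuck on the first one. What I want is a grammar that returns default semantics for one or more of the expected choices when the user doesn't explicitly provide a value for them. (Sorry if my terminology isn't quite right.) I'm using Visual Studio 2010 (trial version) on Windows Vista, with .NET 4.0 installed.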

I started with the "pizza ordering" example from the following article, which seems to come up a lot on forums:

http://msdn.microsoft.com/en-us/magazine/cc163663.aspx#S5

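The code I started with is in Figure 9 of that article. Unfortunately, for some reason (possibly a change from one version of SAPI to the next?), many of its function calls are not actually valid in .NET 4.0 / SAPI 5.3, for example GrammarBuilder.AppendChoices() and GrammarBuilder.AppendResultKeyValue(). The latter call is supposed to supply default choices for the "size" and "crust" keys when the user specifies only the topping (i.e. "a cheese pizza, please" returns size=large, crust=thick, topping=cheese), so I'm trying to figure out how to make this work.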

Here is the relevant part of my code (it should be just a rewrite of the code from the article above):

// [I'd like] a [< size >] [< crust >] [< topping >] pizza [please]

// build the core set of choices  
GrammarBuilder grbSizes = new GrammarBuilder(new Choices("small", "regular", "large"));  
GrammarBuilder grbCrusts = new GrammarBuilder(new Choices("thin crust", "thick crust"));  
GrammarBuilder grbToppings = new GrammarBuilder(new Choices("vegetarian", "pepperoni", "cheese"));    

// Wrap them in semantic result keys
SemanticResultKey skeySize = new SemanticResultKey("size", grbSizes);  
SemanticResultKey skeyCrust = new SemanticResultKey("crust", grbCrusts);  
SemanticResultKey skeyTopping = new SemanticResultKey("topping", grbToppings);  

// And some default values for later on...    
SemanticResultKey skeyDefaultSize = new SemanticResultKey("size", new GrammarBuilder(new SemanticResultValue("large")));  
SemanticResultKey skeyDefaultCrust = new SemanticResultKey("crust", new GrammarBuilder(new SemanticResultValue("thick crust")));  


// [...snip...]  
// Here's the builder for one of several sub-grammars, the one with two default
// values... This should allow "cheese" in "A cheese pizza" to be interpreted
// as large+thick-crust+cheese

//choose topping only, and assume the rest
GrammarBuilder toppingOnly = new GrammarBuilder();
toppingOnly += skeyTopping;
toppingOnly += skeyDefaultSize;
toppingOnly += skeyDefaultCrust;

// [...snip...]
// Later code builds up the full pattern just as in the original article

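I know that the MSDN page for the SemanticResultKey constructor includes a warning that "there should be one, and only one, untagged SemanticResultValue instance in the GrammarBuilder objects specified by the builders parameter", or else you'll get an exception. And indeed, I do get a TargetInvocationException when I say something like "A cheese pizza" into the recognizer.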

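So my first question is: can someone explain what is going on here? I wouldn't have expected this constraint to apply, because (a) I thought my declarations of skeyDefaultSize and skeyDefaultCrust did associate the SemanticResultValues with SemanticResultKeys, so those values shouldn't count as "untagged"; and (b) the two SemanticResultValues in question come from different GrammarBuilders, which in turn sit inside different SemanticResultKeys, and that doesn't seem to be the scenario described on the MSDN page.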

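My second question, then: why does the following code work? The only difference is that I reordered some lines so that the two "default" keys are not appended to the grammar consecutively.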

//choose topping only, and assume the rest
GrammarBuilder toppingOnly = new GrammarBuilder();
toppingOnly += skeyDefaultSize;
toppingOnly += skeyTopping;
toppingOnly += skeyDefaultCrust;

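This gives exactly the desired result when I say, for example, "A cheese pizza": all of the keys ("size", "crust", "topping") are present in the SemanticValue I catch in the SpeechRecognized handler, with the desired defaults for size and crust plus the user-specified value for topping.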

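I guess the third and most important question is: is there a way to do this properly? Obviously, tweaking the order of the appends is too "magic" and won't always be a viable solution.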

Sorry for the giant question, and thanks very much for your help!

1 Answer:

Answer 0: (score: 1)

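I ran into the same problem while learning from the MSDN article. I don't know whether my solution is the "best", but here is how I updated the pizza grammar and handled the default choices.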

First, here is how I create the pizza grammar:

private Grammar CreatePizzaGrammar()
{
    //create the pizza grammar
    GrammarBuilder pizzaRequest = CreatePizzaGrammarBuilder();
    Grammar pizzaGrammar = new Grammar(pizzaRequest);
    return pizzaGrammar;
}

private GrammarBuilder CreatePizzaGrammarBuilder()
{
    // this is adapted from the sample in http://msdn.microsoft.com/en-us/magazine/cc163663.aspx
    // but the API changed before Vista was released so some changes were made.

    //[I'd like] a [<size>] [<crust>] [<topping>] pizza [please]

    //build the core set of choices

    // size
    Choices sizes = new Choices();
    SemanticResultValue sizeSRV;
    sizeSRV = new SemanticResultValue("small", "small");
    sizes.Add(sizeSRV);
    sizeSRV = new SemanticResultValue("regular", "regular");
    sizes.Add(sizeSRV);
    sizeSRV = new SemanticResultValue("medium", "regular");
    sizes.Add(sizeSRV);
    sizeSRV = new SemanticResultValue("large", "large");
    sizes.Add(sizeSRV);
    SemanticResultKey sizeSemKey = new SemanticResultKey("size", sizes);

    // crust
    Choices crusts = new Choices();
    SemanticResultValue crustSRV;
    crustSRV = new SemanticResultValue("thin crust", "thin crust");
    crusts.Add(crustSRV);
    crustSRV = new SemanticResultValue("thin", "thin crust");
    crusts.Add(crustSRV);
    crustSRV = new SemanticResultValue("thick crust", "thick crust");
    crusts.Add(crustSRV);
    crustSRV = new SemanticResultValue("thick", "thick crust");
    crusts.Add(crustSRV);
    SemanticResultKey crustSemKey = new SemanticResultKey("crust", crusts);

    // toppings
    Choices toppings = new Choices();
    SemanticResultValue toppingSRV;
    toppingSRV = new SemanticResultValue("vegetarian", "vegetarian");
    toppings.Add(toppingSRV);
    toppingSRV = new SemanticResultValue("veggie", "vegetarian");
    toppings.Add(toppingSRV);
    toppingSRV = new SemanticResultValue("pepperoni", "pepperoni");
    toppings.Add(toppingSRV);
    toppingSRV = new SemanticResultValue("cheese", "cheese");
    toppings.Add(toppingSRV);
    toppingSRV = new SemanticResultValue("plain", "cheese");
    toppings.Add(toppingSRV);
    SemanticResultKey toppingSemKey = new SemanticResultKey("topping", toppings);

    //build the permutations of choices...

    // 1. choose all three
    GrammarBuilder sizeCrustTopping = new GrammarBuilder();
    sizeCrustTopping.Append(sizeSemKey);
    sizeCrustTopping.Append(crustSemKey);
    sizeCrustTopping.Append(toppingSemKey);

    // 2. choose size and topping
    GrammarBuilder sizeAndTopping = new GrammarBuilder();
    sizeAndTopping.Append(sizeSemKey);
    sizeAndTopping.Append(toppingSemKey);
    // sizeAndTopping.Append(new SemanticResultKey("crust", "thick crust"));
    // sizeAndTopping.AppendResultKeyValue("crust", "thick crust");

    // 3. choose size and crust, and assume cheese
    GrammarBuilder sizeAndCrust = new GrammarBuilder();
    sizeAndCrust.Append(sizeSemKey);
    sizeAndCrust.Append(crustSemKey);

    // 4. choose crust and topping, and assume the default size
    GrammarBuilder toppingAndCrust = new GrammarBuilder();
    toppingAndCrust.Append(crustSemKey);
    toppingAndCrust.Append(toppingSemKey);


    // 5. choose topping only, and assume the rest
    GrammarBuilder toppingOnly = new GrammarBuilder();
    toppingOnly.Append(toppingSemKey);         //, "topping");

    // 6. choose size only, and assume the rest
    GrammarBuilder sizeOnly = new GrammarBuilder();
    sizeOnly.Append(sizeSemKey);

    // 7. choose crust only, and assume the rest
    GrammarBuilder crustOnly = new GrammarBuilder();
    crustOnly.Append(crustSemKey);


    //assemble the permutations             
    Choices permutations = new Choices();
    permutations.Add(sizeCrustTopping);
    permutations.Add(sizeAndTopping);
    permutations.Add(sizeAndCrust);
    permutations.Add(toppingAndCrust);
    permutations.Add(toppingOnly);
    permutations.Add(sizeOnly);
    permutations.Add(crustOnly);

    GrammarBuilder permutationList = new GrammarBuilder();
    permutationList.Append(permutations);

    //now build the complete pattern...
    GrammarBuilder pizzaRequest = new GrammarBuilder();
    //pre-amble "[I'd like] a"
    pizzaRequest.Append(new Choices("I'd like a", "a", "I need a", "I want a"));
    //permutations "[<size>] [<crust>] [<topping>]"
    pizzaRequest.Append(permutationList, 0, 1);
    //post-amble "pizza [please]"
    pizzaRequest.Append(new Choices("pizza", "pizza please", "pie", "pizza pie"));

    return pizzaRequest;
}

Then I set up the event handler for the SpeechRecognized event as follows:

void recognizer_SpeechRecognizedPizza(object sender, SpeechRecognizedEventArgs e)
{

    // set the default semantic key values if the result does not include these
    string size = "regular";
    string crust = "thick crust";
    string topping = "cheese";

    if (e.Result.Semantics != null && e.Result.Semantics.Count != 0)
    {
        if (e.Result.Semantics.ContainsKey("size"))
        {
            size = e.Result.Semantics["size"].Value.ToString();
            AppendTextOuput(String.Format("\r\n  Size = {0}.", size));
        }

        if (e.Result.Semantics.ContainsKey("crust"))
        {
            crust = e.Result.Semantics["crust"].Value.ToString();
            AppendTextOuput(String.Format("\r\n  Crust = {0}.", crust));
        }

        if (e.Result.Semantics.ContainsKey("topping"))
        {
            topping = e.Result.Semantics["topping"].Value.ToString();
            AppendTextOuput(String.Format("\r\n  Topping = {0}.", topping));
        }
    }
    String sOutput = String.Format("\r\nRecognized: You have ordered a {0}, {1}, {2} pizza.", size, crust, topping);
    AppendTextOuput(sOutput);
}

AppendTextOuput is just my own little method for appending a string to the output.

Explicitly laying out every possible permutation in the grammar seems like a lot of work. However, it works very well.
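One possible way to cut down on the hand-written permutations is to append each semantic key as an optional element, using the same min/max repeat counts of 0 and 1 that the grammar above already applies to the permutation list. The following is an untested sketch, not something taken from the MSDN article: the class name PizzaGrammarSketch and the method BuildOptionalSlots are hypothetical, and it assumes the recognizer handles three consecutive optional keys the same way it handles the seven explicit permutations.

```csharp
using System.Speech.Recognition;

static class PizzaGrammarSketch
{
    // Hypothetical helper: builds "[<size>] [<crust>] [<topping>]" by making
    // each semantic key optional instead of enumerating every permutation.
    public static GrammarBuilder BuildOptionalSlots(
        SemanticResultKey sizeSemKey,
        SemanticResultKey crustSemKey,
        SemanticResultKey toppingSemKey)
    {
        GrammarBuilder slots = new GrammarBuilder();

        // Each key may occur zero or one time, in this fixed order.
        slots.Append(new GrammarBuilder(sizeSemKey), 0, 1);
        slots.Append(new GrammarBuilder(crustSemKey), 0, 1);
        slots.Append(new GrammarBuilder(toppingSemKey), 0, 1);

        return slots;
    }
}
```

If this behaves as assumed, the result would be appended to pizzaRequest in place of the permutationList line, and the defaults would still be filled in by the event handler.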

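As you can see, I ended up sidestepping the problem of having the grammar supply the defaults and simply built them into the event handler. There may be a better way.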
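The defaults-in-the-handler approach can also be factored into a small helper so the default values live in one place. This is only a sketch: the class PizzaDefaults and its Apply method are hypothetical names, and the default values are the ones used in the handler above.

```csharp
using System.Collections.Generic;

static class PizzaDefaults
{
    // Default values, copied from the event handler above.
    static readonly Dictionary<string, string> Defaults =
        new Dictionary<string, string>
        {
            { "size", "regular" },
            { "crust", "thick crust" },
            { "topping", "cheese" }
        };

    // Hypothetical helper: starts from the defaults and overwrites them
    // with whatever keys the recognizer actually returned.
    public static Dictionary<string, string> Apply(
        IDictionary<string, string> recognized)
    {
        Dictionary<string, string> order =
            new Dictionary<string, string>(Defaults);

        foreach (KeyValuePair<string, string> pair in recognized)
        {
            order[pair.Key] = pair.Value;
        }

        return order;
    }
}
```

In the handler, the recognized dictionary would be filled from e.Result.Semantics by copying each key together with its Value.ToString(), skipping any entry whose Value is null.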

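Another step towards learning more is to use the SrgsDocument.WriteSrgs() method to write out an SRGS XML document representing the grammar. Rules and semantic tags are easier to visualize in XML.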
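A minimal sketch of dumping the grammar to SRGS XML might look like this. The class GrammarDump, the method WriteSrgsFile, and the output path are hypothetical; the sketch relies on the SrgsDocument constructor that accepts a GrammarBuilder and on WriteSrgs taking an XmlWriter.

```csharp
using System.Speech.Recognition;
using System.Speech.Recognition.SrgsGrammar;
using System.Xml;

static class GrammarDump
{
    // Hypothetical helper: writes the SRGS XML for a GrammarBuilder to a file.
    public static void WriteSrgsFile(GrammarBuilder builder, string path)
    {
        SrgsDocument document = new SrgsDocument(builder);

        // Indent the output so the rules and tags are easier to read.
        XmlWriterSettings settings = new XmlWriterSettings { Indent = true };

        using (XmlWriter writer = XmlWriter.Create(path, settings))
        {
            document.WriteSrgs(writer);
        }
    }
}
```

From inside the class that defines CreatePizzaGrammarBuilder(), a call such as GrammarDump.WriteSrgsFile(CreatePizzaGrammarBuilder(), "pizza.srgs.xml") should produce a file that can be opened in any XML viewer.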