SpeechRecognitionEngine: recognizing words that are not in the grammar

Time: 2013-12-11 12:32:06

Tags: c# .net speech-recognition grammar

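I'm trying to build a speech recognition program with SpeechRecognitionEngine. One feature I want is to google a phrase that I speak: I would say "Google" followed by my phrase. The problem is that SpeechRecognitionEngine only recognizes the words I have added to the Grammar, so it cannot recognize an arbitrary phrase. How can I implement this feature?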

2 answers:

Answer 0 (score: 2)

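This is possible, but harder to implement. Basically, you put the system into a continuous listening/dictation mode and decide what to do with each recognized phrase.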

I found an answer you might want to try: C# SAPI - Recognizing phrases without pre-defined condition statements

Here is a simple example (.NET 4.5, WinForms; the project needs a reference to the System.Speech assembly). Hope it helps:

using System;
using System.Windows.Forms;
using System.Speech.Recognition;
using System.Speech.Synthesis;
using System.Diagnostics;
using System.Text.RegularExpressions;

public partial class Main : Form
{
    private SpeechRecognitionEngine listener;
    private SpeechSynthesizer speaker;

    public Main()
    {
        InitializeComponent();
    }

    private void Main_Load(object sender, EventArgs e)
    {
        speaker = new SpeechSynthesizer();
        //SelectVoice throws if this voice is not installed; remove the call to use the default voice
        speaker.SelectVoice("Microsoft David Desktop");

        GrammarBuilder builder = new GrammarBuilder();
        builder.AppendDictation();

        Grammar grammar = new Grammar(builder);

        listener = new SpeechRecognitionEngine();
        listener.LoadGrammar(grammar);
        listener.SetInputToDefaultAudioDevice();
        listener.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(listener_SpeechRecognized);
        listener.RecognizeAsync(RecognizeMode.Multiple);
    }

    void listener_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        string commandName = e.Result.Text;

        switch (commandName.ToLower())
        {
            case "exit":

                speaker.Speak("talk to you later, bye");
                Application.Exit();
                break;

            case "stop speaking":
            case "stop talking":
            case "be quiet":
            case "silence":

                speaker.SpeakAsyncCancelAll();
                break;

            case "hello":

                speaker.SpeakAsync("hello, how are you doing?");
                break;

            case "i'm fine":

                speaker.SpeakAsync("i am glad to hear that");
                break;

            case "thank you":

                speaker.SpeakAsync("you're welcome");
                break;

            case "thanks":

                speaker.SpeakAsync("no problem");
                break;

            case "what's the time":
            case "what time is it":
            case "time":

                speaker.SpeakAsync(DateTime.Now.ToShortTimeString());

                break;

            case "what's the date":
            case "what day is it":
            case "what is today's date":
            case "what is today":
            case "today":

                speaker.SpeakAsync(DateTime.Today.ToString("dddd, MMMM d, yyyy"));

                break;

            case "do read it":

                Process.Start("http://www.reddit.com");
                break;

            case "do face book":

                Process.Start("http://www.facebook.com");
                break;

            case "do search":

                Process.Start("http://www.google.com");
                break;

            case "do videos":

                Process.Start("http://www.youtube.com");
                break;

            default:

                //handle non-normalized recognition
                Match m = Regex.Match(commandName, "YOUR_PATTERN_HERE");

                if (m.Success)
                {
                    speaker.SpeakAsync("I found a match");

                    //example: URL-encode the value before putting it in the query string
                    //Process.Start("http://www.google.com/search?q=" + Uri.EscapeDataString(m.Value));
                }

                break;
        }
    }

    private void Main_FormClosing(object sender, FormClosingEventArgs e)
    {
        //be sure to clean up!
        listener.RecognizeAsyncCancel();
        listener.UnloadAllGrammars();
        listener.Dispose();
        listener = null;

        speaker.Dispose();
        speaker = null;
    }
}
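
For the specific "Google + phrase" case from the question, one option that may work is a grammar made of the fixed word "Google" followed by dictation, rather than open dictation for everything. The sketch below is an assumption layered on the example above, not part of the original answer: `GoogleVoiceSearch`, `BuildGoogleGrammar`, `TryGetQuery`, `BuildSearchUrl`, and `OnSpeechRecognized` are hypothetical names, and the Google search URL format is assumed. It also assumes .NET Framework, where `Process.Start(string)` opens a URL in the default browser.

```csharp
using System;
using System.Diagnostics;
using System.Speech.Recognition;

// Sketch only: all names in this class are hypothetical, not from the original answer.
public static class GoogleVoiceSearch
{
    private const string Prefix = "google";

    // Builds a grammar that matches the fixed word "Google" followed by free dictation.
    public static Grammar BuildGoogleGrammar()
    {
        GrammarBuilder builder = new GrammarBuilder("Google");
        builder.AppendDictation();
        return new Grammar(builder) { Name = "GoogleSearch" };
    }

    // Returns true and the dictated phrase when the text starts with "Google ".
    public static bool TryGetQuery(string recognizedText, out string query)
    {
        query = null;
        if (string.IsNullOrWhiteSpace(recognizedText)) return false;

        string text = recognizedText.Trim();
        if (!text.StartsWith(Prefix + " ", StringComparison.OrdinalIgnoreCase)) return false;

        query = text.Substring(Prefix.Length).Trim();
        return query.Length > 0;
    }

    // URL-encodes the phrase and appends it to an assumed Google search URL.
    public static string BuildSearchUrl(string query)
    {
        return "https://www.google.com/search?q=" + Uri.EscapeDataString(query);
    }

    // Handler intended for SpeechRecognitionEngine.SpeechRecognized.
    public static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        string query;
        if (TryGetQuery(e.Result.Text, out query))
        {
            // Assumes .NET Framework, where this opens the default browser.
            Process.Start(BuildSearchUrl(query));
        }
    }
}
```

To try it, load the grammar with `listener.LoadGrammar(GoogleVoiceSearch.BuildGoogleGrammar())` and subscribe `GoogleVoiceSearch.OnSpeechRecognized` to `listener.SpeechRecognized`. Parsing the recognized text in a separate method keeps that part checkable without a microphone.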

Answer 1 (score: 0)

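This isn't possible at the current stage of speech technology. The best you can do is make the grammar large enough that it seems as if every word is recognized, but accuracy will drop considerably.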
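
As a rough illustration of the "large grammar" idea, the sketch below builds a grammar of "Google" followed by one or more words taken from a word list. This is only an assumption about how such a grammar might be put together: `LargeVocabularyGrammar`, `CleanVocabulary`, and `Build` are hypothetical names, and the word list is something you would have to supply yourself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Speech.Recognition;

// Sketch only: all names in this class are hypothetical.
public static class LargeVocabularyGrammar
{
    // Trims, lower-cases, drops blank entries, and de-duplicates the word list
    // so that no null or empty entries are passed to Choices.
    public static string[] CleanVocabulary(IEnumerable<string> words)
    {
        return words
            .Where(w => !string.IsNullOrWhiteSpace(w))
            .Select(w => w.Trim().ToLowerInvariant())
            .Distinct()
            .ToArray();
    }

    // "Google" followed by between one and maxWords words from the vocabulary.
    public static Grammar Build(IEnumerable<string> words, int maxWords)
    {
        Choices vocabulary = new Choices(CleanVocabulary(words));

        GrammarBuilder builder = new GrammarBuilder("Google");
        builder.Append(new GrammarBuilder(vocabulary), 1, maxWords);

        return new Grammar(builder);
    }
}
```

The resulting grammar would be loaded with `LoadGrammar` like any other. As the answer notes, the larger the word list, the more recognition accuracy can be expected to suffer.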