Watson Speech to Text: live streaming C# code example

Time: 2017-09-12 14:46:43

Tags: c# stream speech-recognition speech-to-text ibm-watson

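I am trying to build an application in C# that takes an audio stream (from a file for now, but later it will be a network stream) and returns the transcription in real time as Watson makes it available, similar to the demo at https://speech-to-text-demo.mybluemix.net/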

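Does anyone know where I can find some sample code, preferably in C#, that could help me get started?

I tried the following, based on the limited documentation at https://github.com/watson-developer-cloud/dotnet-standard-sdk/tree/development/src/IBM.WatsonDeveloperCloud.SpeechToText.v1, but I get a BadRequest error when I call RecognizeWithSession. I am not sure whether I am on the right track.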

    static void StreamingRecognize(string filePath)
    {
        SpeechToTextService _speechToText = new SpeechToTextService();
        _speechToText.SetCredential(<user>, <pw>);
        var session = _speechToText.CreateSession("en-US_BroadbandModel");

        //returns initialized
        var recognizeStatus = _speechToText.GetSessionStatus(session.SessionId);

        //  set up observe
        var taskObserveResult = Task.Factory.StartNew(() =>
        {
            var result = _speechToText.ObserveResult(session.SessionId);
            return result;
        });

        //  get results
        taskObserveResult.ContinueWith((antecedent) =>
        {
            var results = antecedent.Result;
        });

        var metadata = new Metadata();
        metadata.PartContentType = "audio/wav";
        metadata.DataPartsCount = 1;
        metadata.Continuous = true;
        metadata.InactivityTimeout = -1;
        var taskRecognizeWithSession = Task.Factory.StartNew(() =>
        {
            using (FileStream fs = File.OpenRead(filePath))
            {
                _speechToText.RecognizeWithSession(session.SessionId, "audio/wav", metadata, fs, "chunked");
            }
        });
    }

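1 Answer:

Answer 0: (score: 0)

The Watson Developer Cloud SDK for your programming language includes a folder named Examples, where you can find samples that use Speech to Text.

The SDK supports WebSockets, which fits your requirement of transcribing closer to real time rather than uploading a complete audio file.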

    static void Main(string[] args)
    {
        Transcribe();
        Console.WriteLine("Press any key to exit");
        Console.ReadLine();
    }

    // http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml
    static String username = "<username>";
    static String password = "<password>";

    static String file = @"c:\audio.wav";

    static Uri url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");

    // these should probably be private classes that use DataContractJsonSerializer
    // see https://msdn.microsoft.com/en-us/library/bb412179%28v=vs.110%29.aspx
    // or the ServiceState class at the end
    static ArraySegment<byte> openingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
        "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"
    ));
    static ArraySegment<byte> closingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
        "{\"action\": \"stop\"}"
    ));
    // ... more in the link below
  • Access the SDK for C# here.
  • For details, see the API reference here.
  • A complete example of using Speech to Text, from an IBM Watson developer, is here.
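
The snippet in the answer stops before the part that actually moves audio over the socket. As a rough sketch only (not the code from the linked example), the flow could look like the following with the standard `System.Net.WebSockets.ClientWebSocket` class: send the JSON start message as text, send the audio as binary frames, send the stop message, and read JSON messages as they arrive. The class name `WatsonStreamingSketch`, its method names, the 4096-byte chunk size, and the use of basic-auth credentials on the socket are all assumptions made for illustration; the endpoint and authentication scheme should be checked against the current service documentation.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, not part of any IBM SDK.
public static class WatsonStreamingSketch
{
    // Builds the JSON "start" message with the same fields as the answer's openingMessage.
    public static string BuildStartMessage(string contentType, bool interimResults)
    {
        return "{\"action\": \"start\", \"content-type\": \"" + contentType +
               "\", \"continuous\": true, \"interim_results\": " +
               (interimResults ? "true" : "false") + "}";
    }

    // Sends one complete UTF-8 text frame.
    static Task SendTextAsync(WebSocket ws, string json, CancellationToken ct)
    {
        var bytes = new ArraySegment<byte>(Encoding.UTF8.GetBytes(json));
        return ws.SendAsync(bytes, WebSocketMessageType.Text, true, ct);
    }

    // Sends the start message, then the audio in binary chunks, then the stop message.
    public static async Task SendAudioAsync(WebSocket ws, Stream audio,
                                            CancellationToken ct, int chunkSize = 4096)
    {
        await SendTextAsync(ws, BuildStartMessage("audio/wav", true), ct);
        var buffer = new byte[chunkSize];
        int read;
        while ((read = await audio.ReadAsync(buffer, 0, buffer.Length, ct)) > 0)
        {
            await ws.SendAsync(new ArraySegment<byte>(buffer, 0, read),
                               WebSocketMessageType.Binary, true, ct);
        }
        await SendTextAsync(ws, "{\"action\": \"stop\"}", ct);
    }

    // Prints each complete JSON message until the connection is closed.
    public static async Task ReceiveLoopAsync(WebSocket ws, CancellationToken ct)
    {
        var buffer = new byte[8192];
        var message = new MemoryStream();
        while (ws.State == WebSocketState.Open)
        {
            var result = await ws.ReceiveAsync(new ArraySegment<byte>(buffer), ct);
            if (result.MessageType == WebSocketMessageType.Close) break;
            message.Write(buffer, 0, result.Count);
            if (result.EndOfMessage)
            {
                // Interim and final transcripts arrive as JSON text.
                Console.WriteLine(Encoding.UTF8.GetString(message.ToArray()));
                message.SetLength(0);
            }
        }
    }

    // Connects, then sends audio while receiving results concurrently.
    public static async Task TranscribeAsync(Uri url, string username,
                                             string password, string file)
    {
        using (var ws = new ClientWebSocket())
        using (var audio = File.OpenRead(file))
        {
            // Assumption: the service accepts basic-auth credentials on the handshake.
            ws.Options.Credentials = new NetworkCredential(username, password);
            await ws.ConnectAsync(url, CancellationToken.None);
            var receiving = ReceiveLoopAsync(ws, CancellationToken.None);
            await SendAudioAsync(ws, audio, CancellationToken.None);
            await receiving;
        }
    }
}
```

With the fields from the answer, this could be called as `WatsonStreamingSketch.TranscribeAsync(url, username, password, file).Wait();`. The receive loop here simply runs until the connection closes; deciding when to close the socket after the final result is left out, as is parsing the JSON into typed objects. For a network stream instead of a file, any readable `Stream` can be passed to `SendAudioAsync`.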