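I'm trying to build an application in C# that takes an audio stream (from a file for now, but later from a network stream) and returns transcriptions in real time as Watson makes them available, similar to the demo at https://speech-to-text-demo.mybluemix.net/
Does anyone know where I can find sample code, preferably in C#, that would help me get started?
I tried the following based on the limited documentation at https://github.com/watson-developer-cloud/dotnet-standard-sdk/tree/development/src/IBM.WatsonDeveloperCloud.SpeechToText.v1, but I get a BadRequest error when I call RecognizeWithSession. I'm not sure whether I'm on the right track.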
static void StreamingRecognize(string filePath)
{
    SpeechToTextService _speechToText = new SpeechToTextService();
    _speechToText.SetCredential(<user>, <pw>);
    var session = _speechToText.CreateSession("en-US_BroadbandModel");

    // returns initialized
    var recognizeStatus = _speechToText.GetSessionStatus(session.SessionId);

    // set up observe
    var taskObserveResult = Task.Factory.StartNew(() =>
    {
        var result = _speechToText.ObserveResult(session.SessionId);
        return result;
    });

    // get results
    taskObserveResult.ContinueWith((antecedent) =>
    {
        var results = antecedent.Result;
    });

    var metadata = new Metadata();
    metadata.PartContentType = "audio/wav";
    metadata.DataPartsCount = 1;
    metadata.Continuous = true;
    metadata.InactivityTimeout = -1;

    var taskRecognizeWithSession = Task.Factory.StartNew(() =>
    {
        using (FileStream fs = File.OpenRead(filePath))
        {
            _speechToText.RecognizeWithSession(session.SessionId, "audio/wav", metadata, fs, "chunked");
        }
    });
}
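Answer 0 (score: 0)
In the Watson Developer Cloud SDK for your programming language, there is a folder named Examples that contains samples using Speech to Text.
The SDK supports WebSockets, which meets your requirement of transcribing in closer to real time rather than uploading an audio file.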
static void Main(string[] args)
{
    Transcribe();
    Console.WriteLine("Press any key to exit");
    Console.ReadLine();
}

// http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/doc/getting_started/gs-credentials.shtml
static String username = "<username>";
static String password = "<password>";
static String file = @"c:\audio.wav";
static Uri url = new Uri("wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize");

// these should probably be private classes that use DataContractJsonSerializer
// see https://msdn.microsoft.com/en-us/library/bb412179%28v=vs.110%29.aspx
// or the ServiceState class at the end
static ArraySegment<byte> openingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
    "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}"
));
static ArraySegment<byte> closingMessage = new ArraySegment<byte>(Encoding.UTF8.GetBytes(
    "{\"action\": \"stop\"}"
));

// ... more in the link below
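Since the snippet above stops before the actual send and receive logic, here is a minimal sketch of how that part might look with the standard System.Net.WebSockets.ClientWebSocket class. It is not the code elided above. The names StreamingSketch, TranscribeAsync, SendTextAsync, ReceiveUntilDoneAsync and IsListening are hypothetical, and the stop condition (waiting for a second {"state": "listening"} message after sending "stop") is an assumption about the service's protocol that should be checked against the current Watson documentation.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

static class StreamingSketch
{
    // Hypothetical helper: true when a service message reports the "listening" state.
    // Substring matching keeps the sketch short; real code should parse the JSON.
    public static bool IsListening(string message)
    {
        return message.Contains("\"state\"") && message.Contains("\"listening\"");
    }

    // Hypothetical method: streams one file and prints every message the service returns.
    public static async Task TranscribeAsync(Uri url, string username, string password, string file)
    {
        using (var ws = new ClientWebSocket())
        {
            ws.Options.Credentials = new NetworkCredential(username, password);
            await ws.ConnectAsync(url, CancellationToken.None);

            // Same opening message as in the snippet above.
            await SendTextAsync(ws, "{\"action\": \"start\", \"content-type\": \"audio/wav\", \"continuous\" : true, \"interim_results\": true}");

            // Start receiving before sending audio so interim results print as they arrive.
            Task receiving = ReceiveUntilDoneAsync(ws);

            var buffer = new byte[8192];
            using (FileStream fs = File.OpenRead(file))
            {
                int read;
                while ((read = await fs.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    // Each chunk of audio goes out as one binary message.
                    await ws.SendAsync(new ArraySegment<byte>(buffer, 0, read),
                        WebSocketMessageType.Binary, true, CancellationToken.None);
                }
            }

            await SendTextAsync(ws, "{\"action\": \"stop\"}");
            await receiving;

            if (ws.State == WebSocketState.Open || ws.State == WebSocketState.CloseReceived)
            {
                await ws.CloseAsync(WebSocketCloseStatus.NormalClosure, "done", CancellationToken.None);
            }
        }
    }

    static Task SendTextAsync(ClientWebSocket ws, string json)
    {
        return ws.SendAsync(new ArraySegment<byte>(Encoding.UTF8.GetBytes(json)),
            WebSocketMessageType.Text, true, CancellationToken.None);
    }

    // Assumption: the service reports "listening" once after "start" and once more
    // after "stop", so the second occurrence means all results have been delivered.
    static async Task ReceiveUntilDoneAsync(ClientWebSocket ws)
    {
        var buffer = new byte[8192];
        var pending = new MemoryStream();
        int listeningCount = 0;
        while (listeningCount < 2 && ws.State == WebSocketState.Open)
        {
            WebSocketReceiveResult result =
                await ws.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
            if (result.MessageType == WebSocketMessageType.Close) break;

            // A message may arrive in several frames; collect them before decoding.
            pending.Write(buffer, 0, result.Count);
            if (!result.EndOfMessage) continue;

            string message = Encoding.UTF8.GetString(pending.ToArray());
            pending = new MemoryStream();
            Console.WriteLine(message);
            if (IsListening(message)) listeningCount++;
        }
    }
}
```

From a synchronous Main like the one above, this could be called as StreamingSketch.TranscribeAsync(url, username, password, file).Wait(). Receiving runs concurrently with sending because ClientWebSocket allows one outstanding send and one outstanding receive at the same time, which is what lets interim results show up while audio is still being uploaded.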