我的应用程序的一个用例是将语音(单个单词的发音)转换为文本。我需要为此使用Azure语音文本。有时语音需要转换为整数-例如,我需要以数量形式提交响应。 我的问题是,无论如何,是否通过REST API将语音告诉文本服务我想要一个数字结果?当前,它正在返回诸如“一个”而不是“ 1”和“免费”而不是“ 3”的东西。我认为文档中没有办法做到这一点,但我想看看其他人是否已解决此问题,然后再想办法解决。 这是我在概念证明项目中使用的代码:
public static async Task SpeechToTextAsync(MemoryStream data, ISpeechResultCallback callBack)
{
string accessToken = await Authentication.GetAccessToken();
IToast toastWrapper = DependencyService.Get<IToast>();
if (accessToken != null)
{
toastWrapper.Show("Acquired token");
callBack.SpeechReturned("Acquired token");
using (var client = new HttpClient())
{
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("https://westus.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=en-GB&format=detailed");
request.SendChunked = true;
request.Accept = @"application/json;text/xml";
request.Method = "POST";
request.ProtocolVersion = HttpVersion.Version11;
request.Host = "westus.stt.speech.microsoft.com";
request.ContentType = @"audio/wav; codecs=audio/pcm; samplerate=16000";
// request.Headers["Ocp-Apim-Subscription-Key"] = Program.SubscriptionKey;
request.Headers.Add("Authorization", "Bearer " + accessToken);
request.AllowWriteStreamBuffering = false;
data.Position = 0;
byte[] buffer = null;
int bytesRead = 0;
using (Stream requestStream = request.GetRequestStream())
{
buffer = new Byte[checked((uint)Math.Min(1024, (int)data.Length))];
while ((bytesRead = data.Read(buffer, 0, buffer.Length)) != 0)
{
requestStream.Write(buffer, 0, bytesRead);
}
// Flush
requestStream.Flush();
}
try
{
string responseData = null;
using (WebResponse response = request.GetResponse())
{
var encoding = Encoding.GetEncoding(((HttpWebResponse)response).CharacterSet);
using (var responseStream = response.GetResponseStream())
{
using (var reader = new StreamReader(responseStream, encoding))
{
responseData = reader.ReadToEnd();
AzureSTTResults deserializedProduct = JsonConvert.DeserializeObject<AzureSTTResults>(responseData);
if(deserializedProduct == null || deserializedProduct.NBest == null || deserializedProduct.NBest.Length == 0)
{
toastWrapper.Show("No results");
callBack.SpeechReturned("No results");
}
else
{
toastWrapper.Show(deserializedProduct.NBest[0].ITN);
callBack.SpeechReturned(deserializedProduct.NBest[0].ITN);
}
}
}
}
}
catch (Exception ex)
{
toastWrapper.Show(ex.Message);
callBack.SpeechReturned(ex.Message);
}
}
}
else
{
toastWrapper.Show("No token required");
callBack.SpeechReturned("No token required");
}
}
这是我想成为“ 1”的结果的示例:
{
"RecognitionStatus": "Success",
"Offset": 0,
"Duration": 22200000,
"NBest": [
{
"Confidence": 0.43084684014320374,
"Lexical": "one",
"ITN": "One",
"MaskedITN": "One",
"Display": "One."
}
]
}
答案 0 :(得分:1)
我建议使用Microsoft的nuget。它像一种魅力,here是一个例子。
NumberRecognizer.RecognizeNumber("I have two apples", Culture.English)
答案 1 :(得分:0)
根据官方文档Speech-to-text REST API
,没有任何选项可以帮助将数字单词转换为数字。
考虑到英语中的数字单词具有语法模式,您可以使用简单的算法来实现将单词转换为数字的功能。作为参考,您可以按照以下说明,用C#自己编写一个。
希望有帮助。