Question

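I would like to know whether the Cognitive Services C++ SDK for speech-to-text can return number entities as words instead of digits.

Current response: "I want to order 2 cokes". Expected response: "I want to order two cokes".

I could of course implement the conversion myself, but I would like to know whether the service already provides it, especially for Spanish.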

Answer 1

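Take a look at the sample repository at https://github.com/Azure-Samples/cognitive-services-speech-sdk

In particular, see the file speech_recognition_samples.cpp, function SpeechRecognitionWithLanguageAndUsingDetailedOutputFormat.

Enabling the detailed output format gives you the result you are looking for: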

config->SetOutputFormat(OutputFormat::Detailed);

Then you need to read the detailed output from the result:

result->Properties.GetProperty(PropertyId::SpeechServiceResponse_JsonResult)

This produces detailed output like the following:

{"Duration":35500000,"NBest":[{"Confidence":0.7535948753356934,"Display":"I want to order 2 Cokes.","ITN":"I want to order 2 cokes","Lexical":"i want to order two cokes","MaskedITN":"I want to order 2 cokes"}],"Offset":17000000,"RecognitionStatus":"Success"}

The Lexical field, which spells out numbers as words ("two" rather than "2"), is probably what you want.
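As a rough sketch of how the Lexical field could be pulled out of that JSON string, the helper below uses only the standard library. ExtractLexical is a hypothetical name, not part of the Speech SDK, and the naive string search assumes the compact JSON layout shown above (no whitespace around the colon). For real code a proper JSON parser would be the safer choice.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the Speech SDK).
// Finds the first "Lexical" string field in the detailed JSON result and
// copies its value into `out`. Returns false if the field is missing or
// its string is not terminated.
// Assumption: the JSON is compact, i.e. the field appears as "Lexical":"...".
bool ExtractLexical(const std::string& json, std::string& out) {
    const std::string key = "\"Lexical\":\"";
    const std::size_t start = json.find(key);
    if (start == std::string::npos) {
        return false;  // no Lexical field in this result
    }

    std::string value;
    for (std::size_t i = start + key.size(); i < json.size(); ++i) {
        if (json[i] == '\\' && i + 1 < json.size()) {
            // Copy the character after a backslash verbatim, so \" becomes ".
            // Escapes such as \n or \uXXXX are NOT decoded by this sketch.
            value += json[++i];
        } else if (json[i] == '"') {
            out = value;  // reached the closing quote of the field
            return true;
        } else {
            value += json[i];
        }
    }
    return false;  // ran off the end without a closing quote
}
```

You would pass it the string returned by GetProperty(PropertyId::SpeechServiceResponse_JsonResult). Only the first NBest entry is read here, which is the one shown in the output above.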

Wolfgang

Azure Cognitive Services speech-to-text recognition of number entities as text
