When using Google Cloud Speech with speech contexts, the returned transcription does not contain the expected results

Asked: 2019-03-25 12:56:41

Tags: google-cloud-speech

See: https://issuetracker.google.com/u/1/issues/128352542

We are running into an issue where certain words added to the speech contexts are neither returned nor prioritized.

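When phrase hints are used, the API usually transcribes the supplied phrases or words correctly when they are spoken. Some words, however, are never transcribed, no matter how they are added to the phrase hints.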

Configuration sent inside StreamingRecognitionConfig:

{  
   "config":{  
      "encoding":"LINEAR16",
      "sampleRateHertz":8000,
      "languageCode":"en-US",
      "enableWordTimeOffsets":true,
      "enableAutomaticPunctuation":false,
      "model":"default",
      "useEnhanced":true,
      "speechContexts":[  
         {  
            "phrases":[  
               "Bill Uhma",
               "Uhma",
               "I got coffee with Bill Uhma"
            ]
         }
      ]
   }
}

Results when trying to say "I got coffee with Bill Uhma":

{
   "results":{
      "alternatives":[
         {
            "confidence":0.8440007,
            "transcript":"I got coffee with Bill Uma",
            "words":[
               {
                  "confidence":0.847875,
                  "word":"I"
               },
               {
                  "confidence":0.9265712,
                  "word":"got"
               },
               {
                  "confidence":0.98762906,
                  "word":"coffee"
               },
               {
                  "confidence":0.98762906,
                  "word":"with"
               },
               {
                  "confidence":0.9239746,
                  "word":"Bill"
               },
               {
                  "confidence":0.23432566,
                  "word":"Uma"
               }
            ]
         },
         {
            "confidence":0.94561315,
            "transcript":"I got coffee with Bill Luma"
         },
         {
            "confidence":0.911253,
            "transcript":"I got coffee with Bill Guma"
         },
         {
            "confidence":0.91219664,
            "transcript":"I got coffee with Bill Houma"
         },
         {
            "confidence":0.94028026,
            "transcript":"I got coffee with Bill looma"
         },
         {
            "confidence":0.9403957,
            "transcript":"I got coffee with Bill bouma"
         },
         {
            "confidence":0.9403957,
            "transcript":"I got coffee with Bill goomah"
         },
         {
            "confidence":0.9403957,
            "transcript":"I got coffee with Bill Wilma"
         },
         {
            "confidence":0.938467,
            "transcript":"I got coffee with Bill Boomer"
         },
         {
            "confidence":0.9403957,
            "transcript":"I got coffee with Bill buma"
         },
         {
            "confidence":0.9403957,
            "transcript":"I got coffee with Bill Ooma"
         },
         {
            "confidence":0.9403957,
            "transcript":"I got coffee with Bill Gooma"
         }
      ],
      "confidence":0.8440007,
      "is_final":true,
      "transcription":"I got coffee with Bill Uma"
   }
}

Transcript received: "I got coffee with Bill Uma".

Transcript expected: "I got coffee with Bill Uhma".

As the results show, the supplied hint does not appear in any of the 12 alternatives returned.
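This can be checked mechanically rather than by eye. Below is a minimal sketch in Python, assuming the response has been parsed into a dictionary shaped like the JSON shown earlier. The helper name `hints_found` is hypothetical, and the response is abbreviated to three of the twelve alternatives.

```python
import json
import re


def hints_found(alternatives, phrases):
    """Return the phrases that occur as whole words in at least one transcript."""
    transcripts = [alt.get("transcript", "") for alt in alternatives]
    found = []
    for phrase in phrases:
        # Whole-word, case-insensitive match so "Uhma" does not match inside a longer word.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        if any(pattern.search(t) for t in transcripts):
            found.append(phrase)
    return found


# Abbreviated copy of the response (3 of the 12 alternatives).
response = json.loads("""
{
  "results": {
    "alternatives": [
      {"transcript": "I got coffee with Bill Uma"},
      {"transcript": "I got coffee with Bill Luma"},
      {"transcript": "I got coffee with Bill Houma"}
    ]
  }
}
""")

phrases = ["Bill Uhma", "Uhma", "I got coffee with Bill Uhma"]
matched = hints_found(response["results"]["alternatives"], phrases)
print(matched)  # [] : none of the hints appear
```

Running the same check against the full list of alternatives should likewise return an empty list, since none of the twelve transcripts contains "Uhma".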

Splitting up the phrase hints and sending only one of them at a time has no effect on the results.
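For anyone wanting to repeat the one-hint-at-a-time experiment, here is a minimal sketch that generates one configuration per phrase as plain Python dictionaries. It only builds the JSON payloads and does not call the API. The names `BASE_CONFIG` and `config_with_phrases` are hypothetical; the field values are copied from the configuration above.

```python
import copy
import json

# Settings from the question, without the speech contexts.
BASE_CONFIG = {
    "config": {
        "encoding": "LINEAR16",
        "sampleRateHertz": 8000,
        "languageCode": "en-US",
        "enableWordTimeOffsets": True,
        "enableAutomaticPunctuation": False,
        "model": "default",
        "useEnhanced": True,
    }
}


def config_with_phrases(phrases):
    """Return a copy of BASE_CONFIG with a single speech context holding `phrases`."""
    cfg = copy.deepcopy(BASE_CONFIG)
    cfg["config"]["speechContexts"] = [{"phrases": list(phrases)}]
    return cfg


all_phrases = ["Bill Uhma", "Uhma", "I got coffee with Bill Uhma"]

# One configuration per hint, to test each phrase in isolation.
single_hint_configs = [config_with_phrases([p]) for p in all_phrases]

for cfg in single_hint_configs:
    print(json.dumps(cfg["config"]["speechContexts"]))
```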

1 answer:

Answer 0 (score: 0)

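This is not actually a bug. It should instead be treated as a feature request: forcing the recognizer to use the supplied phrases/hints, particularly when a word in the phrases does not exist in its vocabulary. Note that the confidence for the word "Uma" is low (about 0.23 in the result above), which may indicate that the recognizer could not make sense of it because it is not in its vocabulary.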

File feature request here
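Low-confidence words such as the one noted in this answer can be picked out programmatically. Below is a minimal sketch, assuming the `words` list has the shape shown in the question's result. The function name `low_confidence_words` and the 0.5 threshold are arbitrary choices for illustration, not values recommended by the API.

```python
def low_confidence_words(words, threshold=0.5):
    """Return (word, confidence) pairs whose confidence is below `threshold`."""
    return [(w["word"], w["confidence"]) for w in words if w["confidence"] < threshold]


# Word-level results copied from the top alternative in the question.
words = [
    {"confidence": 0.847875, "word": "I"},
    {"confidence": 0.9265712, "word": "got"},
    {"confidence": 0.98762906, "word": "coffee"},
    {"confidence": 0.98762906, "word": "with"},
    {"confidence": 0.9239746, "word": "Bill"},
    {"confidence": 0.23432566, "word": "Uma"},
]

flagged = low_confidence_words(words)
print(flagged)  # [('Uma', 0.23432566)]
```

A word flagged this way is a candidate for being outside the recognizer's vocabulary, although low confidence alone does not prove it.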