Question

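We have been successfully using Google's Java client library with the Speech API to detect speakers in audio files via SeparateRecognitionPerChannel and SpeakerDiarization. However, since last week, the responses from Google no longer contain channel tags or speaker tags.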

This is the RecognitionConfig we use for mono audio files:

RecognitionConfig config = RecognitionConfig.newBuilder()
    /* other settings ... */
    .setEnableSpeakerDiarization(true)
    .setDiarizationSpeakerCount(2)
    .build()

And for two-channel files:

RecognitionConfig config = RecognitionConfig.newBuilder()
    /* other settings ... */
    .setAudioChannelCount(2)
    .setEnableSeparateRecognitionPerChannel(true)
    .build()

We then call longRunningRecognizeAsync with this config, wait for the response, and process it as follows:

// "response" is the future returned by longRunningRecognizeAsync
List<SpeechRecognitionResult> results = response.get().getResultsList()

results.each { speechRecognitionResult ->

    println(speechRecognitionResult.getChannelTag())  // !! Always outputs 0

    SpeechRecognitionAlternative alternative = speechRecognitionResult.getAlternatives(0)

    alternative.getWordsList().each { word ->
        println(word.getSpeakerTag())  // !! Always outputs 0
    }
}
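To confirm that every tag in a response really is 0, rather than only the ones that happen to be printed, the distinct tags can be collected across all results. The following is a minimal, self-contained sketch of that check. The `Word` and `Result` records are hypothetical stand-ins that only mirror the two fields read above (`getChannelTag()` and `getSpeakerTag()`); they are not types from the Google client library, and `TagCheck` is a made-up name. It assumes Java 16 or later for records.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class TagCheck {
    // Hypothetical stand-ins for the library's word and result objects.
    record Word(String text, int speakerTag) {}
    record Result(int channelTag, List<Word> words) {}

    // Collects every distinct speaker tag found on any word of any result.
    static Set<Integer> distinctSpeakerTags(List<Result> results) {
        Set<Integer> tags = new TreeSet<>();
        for (Result result : results) {
            for (Word word : result.words()) {
                tags.add(word.speakerTag());
            }
        }
        return tags;
    }

    // Collects every distinct channel tag found on any result.
    static Set<Integer> distinctChannelTags(List<Result> results) {
        Set<Integer> tags = new TreeSet<>();
        for (Result result : results) {
            tags.add(result.channelTag());
        }
        return tags;
    }

    public static void main(String[] args) {
        // Sample data shaped like the symptom described: all tags are 0.
        List<Result> results = List.of(
            new Result(0, List.of(new Word("hello", 0), new Word("there", 0))),
            new Result(0, List.of(new Word("hi", 0))));

        System.out.println("channel tags: " + distinctChannelTags(results));
        System.out.println("speaker tags: " + distinctSpeakerTags(results));
    }
}
```

If both sets contain only 0 for a real response, the tags are missing from the whole response and not just from the first results that were inspected.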

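Is there something wrong with this approach? Or has the Speech-to-Text API changed so that we now need to call different methods to get the speakers?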

Google's Speech-to-Text API no longer returns channel tags / speaker tags

0 answers: