Question

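I am trying to use Google Speech-to-Text from code. I have a live stream of video and audio in m3u8 format, and I am using FFmpeg to extract the audio from the live URL. I want to send the extracted audio to the Google API (without saving it to disk) and get a transcript back. The stream is processed in chunks. The API never returns any results and never raises any error. Can anyone tell me why the result is always blank? Note: the extracted audio is sent to the Google API as a byte[]. Result: the API returns a blank result with no error message. The code below calls RecognitionAudio.FromBytes.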

            outputStream = ffmpeg.StandardOutput.BaseStream;
            byte[] buffer = new byte[16 * 1024];
            using (MemoryStream ms = new MemoryStream())
            {
                int read;
                while ((read = outputStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    ms.Write(buffer, 0, read);
                    System.Environment.SetEnvironmentVariable("GOOGLE_APPLICATION_CREDENTIALS", "Demo.json");
                    var speech = SpeechClient.Create();
                    var longOperation = speech.Recognize(new RecognitionConfig()
                    {
                        Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
                        EnableSeparateRecognitionPerChannel = true,
                        SampleRateHertz = 16000,
                        LanguageCode = "en",
                    }, RecognitionAudio.FromBytes(ms.ToArray()));
                    // longOperation = longOperation.PollUntilCompleted();
                    // var response = longOperation.Results;
                    foreach (var result in longOperation.Results)
                    {
                        foreach (var alternative in result.Alternatives)
                        {
                            Console.WriteLine(alternative.Transcript);
                        }
                    }
                }
            }

Answer 1

A blank response may indicate that the audio is not in the encoding the request declares. See here for troubleshooting.
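One way to rule out an encoding mismatch is to make FFmpeg emit exactly what the RecognitionConfig in the question declares: headerless 16-bit little-endian PCM at 16000 Hz. The following is a minimal sketch, not a confirmed fix. The class and method names (PcmAudioCheck, BuildFfmpegArguments, CreateStartInfo, PcmDurationSeconds) are hypothetical, the stream URL is a placeholder, and the choice of mono output is an assumption, since the question does not show how ffmpeg was started.

```csharp
using System;
using System.Diagnostics;

public static class PcmAudioCheck
{
    // Hypothetical helper: builds ffmpeg arguments that drop the video track
    // and write raw signed 16-bit little-endian PCM, mono, to stdout.
    // This is meant to match Encoding = Linear16 and SampleRateHertz = 16000.
    public static string BuildFfmpegArguments(string streamUrl, int sampleRateHertz = 16000)
    {
        return $"-loglevel error -i \"{streamUrl}\" -vn " +
               $"-acodec pcm_s16le -ar {sampleRateHertz} -ac 1 -f s16le pipe:1";
    }

    // Hypothetical helper: start info for reading ffmpeg's stdout as a stream.
    // Assumes an "ffmpeg" executable is available on PATH.
    public static ProcessStartInfo CreateStartInfo(string streamUrl)
    {
        return new ProcessStartInfo
        {
            FileName = "ffmpeg",
            Arguments = BuildFfmpegArguments(streamUrl),
            RedirectStandardOutput = true, // audio bytes arrive on StandardOutput.BaseStream
            UseShellExecute = false,       // required when redirecting streams
            CreateNoWindow = true
        };
    }

    // Duration in seconds of a raw 16-bit PCM buffer. Useful as a sanity check
    // on how much audio a chunk really contains before sending it.
    public static double PcmDurationSeconds(long byteCount, int sampleRateHertz = 16000, int channels = 1)
    {
        const int bytesPerSample = 2; // 16-bit samples
        return (double)byteCount / (sampleRateHertz * channels * bytesPerSample);
    }
}
```

With these settings, the 16 KB buffer from the question holds roughly half a second of audio, so a single read may contain too little speech to transcribe. Note also that mono output does not fit EnableSeparateRecognitionPerChannel = true, which only has an effect on multi-channel audio; if the sketch is used as-is, that option would presumably be dropped.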

