Google Speech API: invalid argument 'audio_content' too long

Time: 2017-07-24 22:37:16

Tags: python audio pyaudio google-speech-api

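I get an error when I try to send a batch of audio data to the Google Speech API for transcription. Sometimes it works and sometimes it doesn't. When it fails, I get an error of the following form: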

Traceback (most recent call last):
  File "/Users/mihaileric/Documents/Research/Ford Project/forddialogue/util/record_and_transcribe_audio.py", line 196, in <module>
    listen_for_speech()
  File "/Users/mihaileric/Documents/Research/Ford Project/forddialogue/util/record_and_transcribe_audio.py", line 163, in listen_for_speech
    transcribe_audio(filename)
  File "/Users/mihaileric/Documents/Research/Ford Project/forddialogue/util/record_and_transcribe_audio.py", line 57, in transcribe_audio
    for response in responses:
  File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 366, in next
    return self._next()
  File "/usr/local/lib/python2.7/site-packages/grpc/_channel.py", line 357, in _next
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.INVALID_ARGUMENT, Invalid 'audio_content': too long.)>

The most relevant block of code is probably the following:

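# Imports inferred from the names used below (io, speech, enums, types).
import io

from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

client = speech.SpeechClient()

# [START migration_streaming_request]
# Read the whole recording into memory; use a separate name for the file
# handle so it does not shadow the audio_file path.
with io.open(audio_file, 'rb') as f:
    content = f.read()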

# In practice, stream should be a generator yielding chunks of audio data.
stream = [content]
requests = (types.StreamingRecognizeRequest(audio_content=chunk)
            for chunk in stream)

config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=RATE,
    language_code='en-US')
streaming_config = types.StreamingRecognitionConfig(config=config)

# streaming_recognize returns a generator.
# [START migration_streaming_response]
responses = client.streaming_recognize(streaming_config, requests)
print('Responses: {}'.format(responses))
for response in responses:  # <-- FAILS HERE
    for result in response.results:
        print('Finished: {}'.format(result.is_final))
        print('Stability: {}'.format(result.stability))
        alternatives = result.alternatives
        for alternative in alternatives:
            print('Confidence: {}'.format(alternative.confidence))
            print('Transcript: {}'.format(alternative.transcript))


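What puzzles me is that the audio I am sending is not very long, never more than ~15 seconds. I use a sample rate of 16000 on mono audio, and the file I write is a ".wav". This has also been a hard problem to search for on Google, since nobody else seems to have run into it. I would appreciate any pointers on possible sources of the error. Thanks!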
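For scale, the figures above can be turned into a byte count. The sketch below assumes 16-bit samples (implied by LINEAR16), ignores the WAV header, and uses two hypothetical helpers, pcm_size_bytes and chunk_audio, that are not part of the original script. The 100 ms chunk duration is also an assumption, picked only for illustration. It shows that a 15-second clip is roughly 480,000 bytes, all of which stream = [content] places in a single request, and how the same bytes could instead be yielded in small pieces, as the comment in the code suggests. Whether this removes the error has not been confirmed here.

```python
def pcm_size_bytes(duration_ms, sample_rate_hertz=16000,
                   bytes_per_sample=2, channels=1):
    """Size in bytes of raw PCM audio of the given duration.

    Hypothetical helper; assumes 16-bit (2-byte) samples, as LINEAR16 implies.
    """
    return duration_ms * sample_rate_hertz * bytes_per_sample * channels // 1000


def chunk_audio(content, chunk_size):
    """Yield consecutive slices of content, each at most chunk_size bytes.

    Hypothetical helper; the last slice may be shorter than chunk_size.
    """
    for start in range(0, len(content), chunk_size):
        yield content[start:start + chunk_size]


# A 15-second mono clip at 16 kHz, stood in for by silence.
clip_bytes = pcm_size_bytes(15000)
content = b'\x00' * clip_bytes

# Assumed chunk duration of 100 ms; not taken from the original question.
chunk_size = pcm_size_bytes(100)
chunks = list(chunk_audio(content, chunk_size))

print('Clip size in bytes: {}'.format(clip_bytes))
print('Chunk size in bytes: {}'.format(chunk_size))
print('Number of chunks: {}'.format(len(chunks)))
```

In the script above, the generator would take the place of the one-element list, for example stream = chunk_audio(content, chunk_size), leaving the requests expression unchanged.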

0 answers:

No answers