I am using the Google Speech API to transcribe an audio file, with the Python script at https://github.com/GoogleCloudPlatform/python-docs-samples/blob/master/speech/cloud-client/transcribe_async.py and the following command:
python transcribe_async.py 1503489730.193982.flac
This is the response I get:
Waiting for operation to complete...
Traceback (most recent call last):
  File "transcribe_async.py", line 102, in <module>
    transcribe_file(args.path)
  File "transcribe_async.py", line 52, in transcribe_file
    response = operation.result(timeout=200)
  File "/home/toto/anaconda3/lib/python3.5/site-packages/google/gax/__init__.py", line 596, in result
    raise GaxError(self._operation.error.message)
google.gax.errors.GaxError
I can't figure out what the error is. I may have configured the audio parameters incorrectly, but I really don't know.
Thanks
Answer 0 (score: 4)
LINEAR16 is the only encoding accepted for asynchronous recognition. The documentation describes it as: "Uncompressed 16-bit signed little-endian samples (Linear PCM). This is the only encoding that may be used by AsyncRecognize."
See the documentation.
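To make "16-bit signed little-endian samples" concrete, here is a minimal sketch that packs floating-point samples into that byte layout using only the standard library. The helper name float_to_linear16 and the 440 Hz test tone are illustrative assumptions and are not part of the Speech API or the sample script.

```python
import math
import struct


def float_to_linear16(samples):
    """Pack floats in [-1.0, 1.0] as 16-bit signed little-endian PCM bytes.

    Hypothetical helper for illustration only.
    """
    # Clip to the valid range, then scale to the signed 16-bit integer range.
    ints = [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]
    # "<" selects little-endian byte order, "h" is a signed 16-bit integer.
    return struct.pack("<%dh" % len(ints), *ints)


# Assumed example input: 10 ms of a 440 Hz sine tone at 16000 samples per second.
RATE = 16000
tone = [math.sin(2 * math.pi * 440 * n / RATE) for n in range(RATE // 100)]
pcm = float_to_linear16(tone)
```

Each sample takes exactly two bytes, so the 160 samples above become 320 bytes of headerless audio data.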
You can convert an mp3 to raw like this:
sox async.mp3 -t raw --channels=1 --bits=16 --rate=16000 --encoding=signed-integer --endian=little async.raw
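If you would rather run the conversion from Python, the same sox invocation can be wrapped with subprocess. This is only a sketch: it assumes sox is installed and on your PATH, and the function names build_sox_command and convert_to_linear16 are made up for this example.

```python
import subprocess


def build_sox_command(src, dst, rate=16000):
    """Return the sox argument list that converts src to raw LINEAR16 audio.

    Hypothetical helper; the flags mirror the sox command shown in the answer.
    """
    return [
        "sox", src,
        "-t", "raw",
        "--channels=1",
        "--bits=16",
        "--rate=%d" % rate,
        "--encoding=signed-integer",
        "--endian=little",
        dst,
    ]


def convert_to_linear16(src, dst, rate=16000):
    """Run sox and raise CalledProcessError if the conversion fails."""
    subprocess.run(build_sox_command(src, dst, rate), check=True)
```

For example, convert_to_linear16("async.mp3", "async.raw") should produce the same file as the command above. The sample rate you then declare in the recognition request should match the rate used here (16000).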