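I am having trouble getting long-running speech-to-text recognition to work, especially with automatic punctuation enabled. I know the feature is still in beta. Short audio files work fine, but when I try to transcribe longer files (already in a bucket and accessed via gcs_uri), it always ends with the following error: "google.api_core.exceptions.InvalidArgument: 400 Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter."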
enable_automatic_punctuation works for audio files in the bucket that are shorter than 1 minute.
from google.cloud import speech_v1p1beta1 as speech
from google.cloud.speech_v1p1beta1 import enums
from google.cloud.speech_v1p1beta1 import types

gcs_uri = 'gs://bucket/audiofile.wav'
client = speech.SpeechClient.from_service_account_json('/service_account.json')

audio = types.RecognitionAudio(uri=gcs_uri)
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    language_code='de-DE',
    enable_automatic_punctuation=True)

operation = client.long_running_recognize(config, audio)
print('Waiting for operation to complete...')
response = operation.result(timeout=9000)
response = client.recognize(config, audio)

for i, result in enumerate(response.results):
    alternative = result.alternatives[0]
    print('-' * 20)
    print('First alternative of result {}'.format(i))
    print(u'Transcript: {}'.format(alternative.transcript))
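The result is "google.api_core.exceptions.InvalidArgument: 400 Sync input too long. For audio longer than 1 min use LongRunningRecognize with a 'uri' parameter.", even though the audio file is already stored in the bucket and is accessed from there.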
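One thing that may be worth checking: the snippet calls client.recognize after the long-running operation has already returned, and that synchronous call would be subject to the 1-minute limit on its own. For comparison, here is a minimal sketch of a flow that only uses the asynchronous path. The helper name transcribe_long_audio is hypothetical (not part of the library), and it assumes the client, config and audio objects are built as in the code above.

```python
def transcribe_long_audio(client, config, audio, timeout=9000):
    """Hypothetical helper: transcribe via the asynchronous API only.

    Assumes `client` is a SpeechClient and `config` / `audio` are the
    RecognitionConfig / RecognitionAudio objects built as in the question.
    """
    # Start the asynchronous job; this returns an operation handle.
    operation = client.long_running_recognize(config=config, audio=audio)

    # Block until the job finishes (or the timeout, in seconds, expires).
    response = operation.result(timeout=timeout)

    # Keep the first alternative of each result that has one.
    # client.recognize() is deliberately never called here.
    return [
        result.alternatives[0].transcript
        for result in response.results
        if result.alternatives
    ]
```

It could then be used as transcripts = transcribe_long_audio(client, config, audio), followed by a loop that prints each transcript.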