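I have implemented Google's sample code to transcribe audio to text. The audio is hosted in Google Cloud and has the following properties: format: FLAC, sample rate: 16000 Hz, 320 kbps, channels: mono, language: Spanish. I use the following code: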
import sys
import os
import io
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = '/home/eparionad/Dropbox/Tesis/CredencialesApiGoogle/Tesis-59cc7659afbc.json'
speech_file = '/home/eparionad/Descargas/19-02-2018/JuninInformado/3-JuninInformado-19-02-18-13:51.flac'
def transcribe_gcs():
    """Asynchronously transcribes the audio file specified by the gcs_uri."""
    from google.cloud import speech
    from google.cloud.speech import enums
    from google.cloud.speech import types
    client = speech.SpeechClient()
    gcs_uri = 'gs://audiosparareconocimiento/3-JuninInformado-19-02-18-13:51.flac'
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=16000,
        language_code='es-PE')
    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=90)
    # Each result is for a consecutive portion of the audio. Iterate through
    # them to get the transcripts for the entire audio file.
    for result in response.results:
        # The first alternative is the most likely one for this portion.
        print('Transcript: {}'.format(result.alternatives[0].transcript))
        print('Confidence: {}'.format(result.alternatives[0].confidence))
# [END def_transcribe_gcs]
transcribe_gcs()
Answer 0 (score: -1)
Try increasing the timeout value to get more words.
Instead of:
response = operation.result(timeout=90)
use:
response = operation.result(timeout=900)
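If it helps, here is a minimal sketch of how the wait could be wrapped so that a timeout is handled explicitly instead of crashing the script. The helper name wait_for_transcript is hypothetical (it is not part of the Google library), and the sketch assumes that operation.result() raises concurrent.futures.TimeoutError when the operation has not finished within the timeout, which is how I understand the operation object to behave. It also assumes the response has the same results/alternatives shape used in the question.

```python
import concurrent.futures


def wait_for_transcript(operation, timeout=900):
    """Hypothetical helper: wait up to `timeout` seconds for the operation.

    Returns the joined transcript, or None if the operation did not
    finish in time (assumption: result() raises
    concurrent.futures.TimeoutError in that case).
    """
    try:
        response = operation.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Not finished yet; the caller can try again with a larger timeout.
        return None
    # Each result covers a consecutive portion of the audio; take the
    # first (most likely) alternative of each and join them.
    return ' '.join(
        result.alternatives[0].transcript.strip()
        for result in response.results
        if result.alternatives
    )
```

In the question's code this would replace the operation.result(timeout=90) call and the loop after it, for example transcript = wait_for_transcript(operation, timeout=900), followed by a check for None before printing.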