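How can I transcribe a large audio file with the Google Speech API without hitting the asynchronous transcription error "Operation not complete and retry limit reached."?
Is it feasible to do this in Python, or should I break the file into smaller files and retry?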
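For the "break it into smaller files" option, here is a minimal sketch of how the cut points could be planned in Python. The helper names `plan_chunks` and `ffmpeg_chunk_command` are hypothetical, and the 300-second chunk length is an arbitrary assumption, not a documented API limit. The command only uses the standard ffmpeg options `-i`, `-ss`, `-t` and `-ac`; the sketch builds the argument list and does not run ffmpeg.

```python
def plan_chunks(total_seconds, chunk_seconds):
    """Return (start, duration) pairs that cover total_seconds without gaps."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    chunks = []
    start = 0.0
    while start < total_seconds:
        # The last chunk is shortened so it ends exactly at total_seconds.
        chunks.append((start, min(chunk_seconds, total_seconds - start)))
        start += chunk_seconds
    return chunks


def ffmpeg_chunk_command(src, start, duration, dst):
    """Build (but do not run) an ffmpeg command that cuts one mono chunk."""
    return [
        "ffmpeg", "-i", src,
        "-ss", "{:.3f}".format(start),    # where the chunk starts, in seconds
        "-t", "{:.3f}".format(duration),  # how long the chunk lasts, in seconds
        "-ac", "1",                       # downmix to one channel
        dst,
    ]
```

Each command could then be handed to `subprocess.run(cmd, check=True)`, with the output names (for example `chunk_000.flac`, `chunk_001.flac`) chosen by the caller.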
What I have done so far:
ffmpeg -i 2017-06-13-17_48_51.flac -ac 1 mono.flac
I chose ffmpeg because sox gave me this error:
sox 2017-06-13-17_48_51.flac --channels=1 --bits=16 2017-06-13-17_48_51_more_stable.flac
sox WARN dither: dither clipped 55 samples; decrease volume?
Input File : '2017-06-13-17_48_51.flac'
Channels : 2
Sample Rate : 48000
Precision : 16-bit
Duration : 00:21:18.40 = 61363200 samples ~ 95880 CDDA sectors
File Size : 60.7M
Bit Rate : 380k
Sample Encoding: 16-bit FLAC
ffmpeg -i 2017-06-13-17_48_51.flac -ac 1 mono.flac
Input File : 'mono.flac'
Channels : 1
Sample Rate : 48000
Precision : 16-bit
Duration : 00:21:18.40 = 61363200 samples ~ 95880 CDDA sectors
File Size : 59.9M
Bit Rate : 375k
Sample Encoding: 16-bit FLAC
Comment : 'encoder=Lavf56.40.101'
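The sample rate and channel count reported above can also be read from Python, which makes it easy to check that the values passed to the API match the file. This is only a sketch: `parse_flac_streaminfo` is a hypothetical helper name, and it assumes the file starts with the `fLaC` marker followed directly by the STREAMINFO block (no ID3 tag in front). For `mono.flac` it should report the same 48000 Hz and 1 channel shown above.

```python
import struct


def parse_flac_streaminfo(header):
    """Read sample rate, channels, bit depth and length from a FLAC header.

    `header` must hold at least the first 26 bytes of the file.
    """
    if len(header) < 26 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    if header[4] & 0x7F != 0:  # low 7 bits are the block type; 0 = STREAMINFO
        raise ValueError("first metadata block is not STREAMINFO")
    # Skip marker (4 bytes), block header (4), block sizes (4), frame sizes (6).
    (packed,) = struct.unpack(">Q", header[18:26])
    sample_rate = packed >> 44                # 20 bits
    channels = ((packed >> 41) & 0x7) + 1     # 3 bits, stored minus one
    bits_per_sample = ((packed >> 36) & 0x1F) + 1  # 5 bits, stored minus one
    total_samples = packed & ((1 << 36) - 1)  # 36 bits
    return {
        "sample_rate": sample_rate,
        "channels": channels,
        "bits_per_sample": bits_per_sample,
        "total_samples": total_samples,
        "duration_seconds": total_samples / sample_rate if sample_rate else None,
    }


# Usage (assumes mono.flac is in the working directory):
#     with open("mono.flac", "rb") as f:
#         print(parse_flac_streaminfo(f.read(26)))
```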
Google Speech API asynchronous example with explicit credentials
I changed the FLAC sample rate to 48000 Hz and set the credentials path explicitly through an environment variable:
import argparse
import io
import time
import os

os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "cloud_speech_service_keys.json"


def transcribe_file(speech_file):
    """Transcribe the given audio file asynchronously."""
    from google.cloud import speech
    speech_client = speech.Client()

    with io.open(speech_file, 'rb') as audio_file:
        content = audio_file.read()
        audio_sample = speech_client.sample(
            content,
            source_uri=None,
            encoding='LINEAR16',
            sample_rate_hertz=16000)

    operation = audio_sample.long_running_recognize('en-US')

    retry_count = 100
    while retry_count > 0 and not operation.complete:
        retry_count -= 1
        time.sleep(2)
        operation.poll()

    if not operation.complete:
        print('Operation not complete and retry limit reached.')
        return

    alternatives = operation.results
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))
        print('Confidence: {}'.format(alternative.confidence))
    # [END send_request]


def transcribe_gcs(gcs_uri):
    """Asynchronously transcribes the audio file specified by the gcs_uri."""
    from google.cloud import speech
    speech_client = speech.Client()

    audio_sample = speech_client.sample(
        content=None,
        source_uri=gcs_uri,
        encoding='FLAC',
        sample_rate_hertz=48000)

    operation = audio_sample.long_running_recognize('en-US')

    retry_count = 100
    while retry_count > 0 and not operation.complete:
        retry_count -= 1
        time.sleep(2)
        operation.poll()

    if not operation.complete:
        print('Operation not complete and retry limit reached.')
        return

    alternatives = operation.results
    for alternative in alternatives:
        print('Transcript: {}'.format(alternative.transcript))
        print('Confidence: {}'.format(alternative.confidence))
    # [END send_request_gcs]


if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description=__doc__,
        formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument(
        'path', help='File or GCS path for audio file to be recognized')
    args = parser.parse_args()
    if args.path.startswith('gs://'):
        transcribe_gcs(args.path)
    else:
        transcribe_file(args.path)
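The polling loops in the script give up after 100 polls with a 2-second sleep, roughly 200 seconds in total, which may simply be too short for a recording of about 21 minutes. One possible alternative, sketched below, is to poll until a wall-clock deadline with a growing delay. `wait_for_operation` is a hypothetical helper, and the one-hour default timeout is an assumption; it relies only on the `complete` attribute and `poll()` method already used in the script.

```python
import time


def wait_for_operation(operation, timeout_s=3600.0, initial_delay_s=2.0,
                       max_delay_s=60.0, sleep=time.sleep,
                       clock=time.monotonic):
    """Poll `operation` until it completes or `timeout_s` seconds elapse.

    Returns True if the operation completed, False if the deadline passed.
    `sleep` and `clock` are injectable so the helper can be tested offline.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while not operation.complete:
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        sleep(min(delay, remaining))  # never sleep past the deadline
        operation.poll()
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s
    return True
```

In the script this would replace the `retry_count` loop, for example `if not wait_for_operation(operation): print('Operation not complete and retry limit reached.')`.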