Passing a queue of audio chunks to Google's asynchronous transcription option

Time: 2019-05-20 21:21:47

Tags: python asynchronous queue multiprocessing google-speech-api

I am trying to pass audio chunks, which come from in_data in PyAudio's callback function and are retrieved with chunks.get(in_data), to Google Speech's asynchronous transcribe.

In addition, I am using a ThreadPool from Python's multiprocessing module with a single worker to process these chunks one at a time:

pool = ThreadPool(processes=1, initializer=initGoogleCloud, initargs=(audio_rate, credentials_json, lang_code, asr_narrowband, preferred_phrases, show_all))
async_result = pool.apply_async(GoogleCloud, (self.detect_chunk_buffer.get()))
return_text = async_result.get()

def initGoogleCloud(SAMPLERATE, credentials_json, lang_code, is_narrowband, preferred_phrases, show_all):
    assert isinstance(lang_code, str), "lang_code must be a string."
    try:
        from google.cloud import speech
        from google.cloud.speech import enums
        from google.cloud.speech import types
        from google.oauth2 import service_account
    except ImportError:
        print('google.cloud failed to import.')

    if is_narrowband is True:
        use_enhanced = True
        model = 'phone_call'
    else:
        use_enhanced = False
        model = 'default'

    # Configurations for Google Cloud
    with open('tmp_credentials.json', 'w') as fp:
        json.dump(credentials_json, fp)
    google_credentials = service_account.Credentials.from_service_account_file('tmp_credentials.json')

    client = speech.SpeechClient(credentials=google_credentials)
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=SAMPLERATE,
        language_code=lang_code,
        use_enhanced=use_enhanced,
        model=model)
    streaming_config = types.StreamingRecognitionConfig(config=config, interim_results=True)

def GoogleCloud(audio_chunk):
    # Join the buffered frames into one raw LINEAR16 byte string
    byte_chunk = b''.join(audio_chunk)
    audio = types.RecognitionAudio(content=byte_chunk)
    operation = client.long_running_recognize(config, audio)

    # Wait for the operation to complete
    response = operation.result(timeout=90)

    # Process the response
    return listen_print_loop(response)

Output:

TypeError: GoogleCloud() takes 1 positional argument but 2048 were given
Abort trap: 6

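It looks as if chunk.get() unpacks the whole audio sequence into separate arguments. Is there a way to pass a single chunk from the queue for processing?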
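The error can be reproduced without PyAudio or Google Cloud. The sketch below is only an illustration: handle_chunk is a hypothetical stand-in for GoogleCloud(), and the queue item is assumed to be a list of 2048 byte strings, which is a guess based on the number in the error message.

```python
from multiprocessing.pool import ThreadPool


def handle_chunk(audio_chunk):
    # Hypothetical stand-in for GoogleCloud(): reports how many items it received.
    return len(audio_chunk)


# Assumed shape of one queue item: a list of 2048 byte strings.
chunk = [b"\x00\x00"] * 2048

pool = ThreadPool(processes=1)
# (chunk) is just chunk; parentheses alone do not create a tuple, so
# apply_async spreads the 2048 items over 2048 positional arguments.
async_result = pool.apply_async(handle_chunk, (chunk))
try:
    async_result.get()
    error_message = None
except TypeError as exc:
    # The exception raised in the worker thread is re-raised by get().
    error_message = str(exc)
pool.close()
pool.join()
print(error_message)
```

So the queue returns one item as expected; it is the args parameter of apply_async that gets unpacked.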

My PyAudio format is pyaudio.paInt16.

1 Answer:

Answer 0: (score: 0)

To "pack" the audio chunk into a single argument, I changed this line: async_result = pool.apply_async(GoogleCloud, (self.detect_chunk_buffer.get()))

Now audio_chunk = [self.detect_chunk_buffer.get()] packs the chunk into a list, which is then sent as the argument in async_result = pool.apply_async(rttASR.GoogleCloud, args=(audio_chunk)).
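Why this works: apply_async expects args to be a sequence of arguments. Wrapping the chunk in a one-element list, or in a one-element tuple with a trailing comma, makes the whole chunk arrive as one positional argument. A minimal sketch, again with a hypothetical handle_chunk and an assumed chunk shape:

```python
from multiprocessing.pool import ThreadPool


def handle_chunk(audio_chunk):
    # Hypothetical stand-in for GoogleCloud(): reports how many items it received.
    return len(audio_chunk)


# Assumed shape of one queue item: a list of 2048 byte strings.
chunk = [b"\x00\x00"] * 2048

with ThreadPool(processes=1) as pool:
    # Option 1: a one-element tuple; the trailing comma is what makes it a tuple.
    via_tuple = pool.apply_async(handle_chunk, (chunk,)).get()

    # Option 2: a one-element list, as in the answer. args=(wrapped) is the
    # same as args=wrapped, so the list itself serves as the argument sequence.
    wrapped = [chunk]
    via_list = pool.apply_async(handle_chunk, args=(wrapped)).get()

print(via_tuple, via_list)
```

Both calls hand the worker the complete chunk; the tuple form is the more common spelling.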

It works fine, and it seems that what self.detect_chunk_buffer.get() returns (paInt16 audio chunks taken from in_data in the PyAudio callback) does not need any additional base64 encoding.
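That matches my understanding that the Python client takes raw bytes in the content field, while base64 is only needed when the audio travels as text in a JSON request; treat that as an assumption to check against the API documentation. The sketch below uses made-up sample values to show that joining paInt16 buffers yields plain 16-bit PCM bytes, and that base64 is merely a reversible text encoding of the same data:

```python
import array
import base64

# Made-up frames: two callback buffers of four signed 16-bit samples each,
# in native byte order (little-endian on common desktop hardware).
frames = [
    array.array("h", [0, 1000, -1000, 32767]).tobytes(),
    array.array("h", [-32768, 5, 6, 7]).tobytes(),
]

# Same operation as in GoogleCloud(): concatenate the buffers.
byte_chunk = b"".join(frames)

# Decoding the joined bytes gives back the original samples unchanged.
decoded = array.array("h")
decoded.frombytes(byte_chunk)
samples = decoded.tolist()

# base64 only changes the representation; decoding restores the raw bytes.
as_base64 = base64.b64encode(byte_chunk)
round_trip = base64.b64decode(as_base64)

print(len(byte_chunk), samples)
```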