How to send a WAV audio stream to Dialogflow

Time: 2019-07-05 11:21:05

Tags: python dialogflow

I have a JSON object with a field that contains audio (WAV). Now I want to send it to Dialogflow.

As shown in the code, after decoding the data I want to send it directly to Dialogflow, rather than saving it to a file and then passing that file to Dialogflow.


# Instead of audio_file_path in the method below, I want to pass a variable that contains the audio data

def detect_intent_stream(project_id, session_id, audio_file_path,
                         language_code):
    import dialogflow_v2 as dialogflow  # v2 client library that provides enums and types

    session_client = dialogflow.SessionsClient()

    audio_encoding = dialogflow.enums.AudioEncoding.AUDIO_ENCODING_LINEAR_16
    sample_rate_hertz = 16000

    session_path = session_client.session_path(project_id, session_id)
    print('Session path: {}\n'.format(session_path))

    def request_generator(audio_config, audio_file_path):
        query_input = dialogflow.types.QueryInput(audio_config=audio_config)

        # The first request contains the configuration.
        yield dialogflow.types.StreamingDetectIntentRequest(
            session=session_path, query_input=query_input)

        # Here we are reading small chunks of audio data from a local
        # audio file.  In practice these chunks should come from
        # an audio input device.
        with open(audio_file_path, 'rb') as audio_file:
            while True:
                chunk = audio_file.read(4096)
                if not chunk:
                    break
                # The later requests contain audio data.
                yield dialogflow.types.StreamingDetectIntentRequest(
                    input_audio=chunk)

    audio_config = dialogflow.types.InputAudioConfig(
        audio_encoding=audio_encoding, language_code=language_code,
        sample_rate_hertz=sample_rate_hertz)

    requests = request_generator(audio_config, audio_file_path)
    responses = session_client.streaming_detect_intent(requests)

    print('=' * 20)
    for response in responses:
        print('Intermediate transcript: "{}".'.format(
                response.recognition_result.transcript))

    # Note: The result from the last response is the final transcript along
    # with the detected content.
    query_result = response.query_result

    print('Fulfillment text: {}\n'.format(query_result.fulfillment_text))

# ----------------------------------------------------

import base64

data = request.json["data"]  # this contains the audio (WAV) as a base64 string

decoded = base64.b64decode(data)  # decode the base64 string to bytes

# I want to pass the "decoded" variable, which holds the WAV audio data, instead of an audio file path
detect_intent_stream('my_project_id', 'my_session_id', decoded, 'language_code')

with open('new.wav', 'wb') as f:  # writing the decoded data into a file
    f.write(decoded)

1 Answer:

Answer 0 (score: 0)

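First, make sure the WAV data you receive is in the expected format: its sample rate (often 44100 Hz for WAV files) and its number of audio channels (Dialogflow only uses mono, i.e. single-channel, audio) must match the request configuration. In the code you posted, these are controlled by the audio_encoding variable and the InputAudioConfig, which sets sample_rate_hertz to 16000, so check them against your audio.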
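
One way to check those properties without writing a file is to parse the bytes in memory with the standard-library wave module. This is only a minimal sketch: describe_wav and build_silent_wav are hypothetical helper names, and the silent clip exists purely so the example can run on its own.

```python
import io
import wave


def describe_wav(wav_bytes):
    """Return basic format facts for WAV data held in memory."""
    # wave.open accepts any file-like object, so no temporary file is needed.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hertz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
        }


def build_silent_wav(sample_rate=16000, channels=1, n_frames=160):
    """Build a short silent 16-bit WAV clip in memory (demo data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames * channels)
    return buffer.getvalue()


info = describe_wav(build_silent_wav())
print(info)
```

If the reported sample rate or channel count differs from what the InputAudioConfig declares, either resample the audio or change the configuration to match.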

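Next, you have the WAV data as a base64 string, but StreamingDetectIntentRequest takes bytes for input_audio, so rather than converting the base64 to a string, I would convert it to a bytearray. (In Python 3, base64.b64decode already returns a bytes object.)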
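
A small sketch of that decoding step, assuming the JSON field is named "data" as in the question. The helper name decode_audio_field and the sample payload are made up for illustration.

```python
import base64
import json


def decode_audio_field(payload, field="data"):
    """Decode a base64-encoded audio field of a parsed JSON object to bytes."""
    return base64.b64decode(payload[field])


# Stand-in for the incoming request body: a few bytes shaped like the
# start of a WAV header, base64-encoded into a JSON string field.
raw = b"RIFF\x00\x00\x00\x00WAVE"
payload = json.loads(json.dumps({"data": base64.b64encode(raw).decode("ascii")}))

audio = decode_audio_field(payload)   # bytes, ready to be sliced into chunks
audio_array = bytearray(audio)        # mutable copy, only if mutation is needed
```

Because the decoded value is already bytes, it can be sliced directly; the bytearray conversion is optional.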

Next, instead of yielding audio chunks from an audio file, yield chunks from the byte array.

    def request_generator(audio_config, byte_array):
        query_input = dialogflow.types.QueryInput(audio_config=audio_config)

        # The first request contains the configuration.
        yield dialogflow.types.StreamingDetectIntentRequest(
            session=session_path, query_input=query_input) 

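        # Start at offset 44 to skip the WAV header (this assumes the canonical
        # 44-byte PCM header), then send the audio in 4096-byte slices.
        for start in range(44, len(byte_array), 4096):
            yield dialogflow.types.StreamingDetectIntentRequest(
                input_audio=bytes(byte_array[start:start + 4096]))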
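
The fixed 44-byte offset only holds for WAV files with the canonical header; files that carry extra chunks have a longer one. As an alternative to the fixed offset (a different technique from the one in the answer, offered as a sketch), the standard-library wave module can locate the audio frames itself. The name iter_pcm_chunks is hypothetical.

```python
import io
import wave


def iter_pcm_chunks(wav_bytes, chunk_size=4096):
    """Yield raw PCM chunks of at most chunk_size bytes from in-memory WAV data."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        # Read whole frames so that a sample is never split across two chunks.
        bytes_per_frame = wav.getsampwidth() * wav.getnchannels()
        frames_per_chunk = max(1, chunk_size // bytes_per_frame)
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk
```

Inside request_generator, each chunk produced this way would be passed as input_audio in its own StreamingDetectIntentRequest, after the first request that carries the configuration.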