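I am trying to use the Google Speech-to-Text Python client library. My request succeeds, but the API response comes back empty. I am sending binary audio data from the client: I record 3 seconds of microphone input and then send it in an AJAX request.
I have tried changing the encoding and converting the data to base64, but nothing seems to produce a successful response.
Here is my Python code: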
from flask import Flask, request, render_template
import io
import os
import sys
import json
import base64
# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types
app = Flask(__name__)
@app.route('/audio', methods=['PUT'])
def audio():
    client = speech.SpeechClient()
    content = request.files['audio'].read()
    # with open('voice.wav', 'wb') as file:
    #     file.write(content)
    # with open('voice.wav', 'rb') as file:
    #     content = file.read();
    audio = types.RecognitionAudio(content=base64.b64encode(content))
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.FLAC,
        sample_rate_hertz=48000,
        language_code='en-US')
    # Detects speech in the audio file
    response = client.recognize(config, audio)
    print(response, file=sys.stderr)
    for result in response.results:
        print('Transcript: {}'.format(result.alternatives[0].transcript), file=sys.stderr)
    return json.dumps({'success':True}), 200, {'ContentType':'application/json'}
And my JS code:
const recordAudio = () =>
  new Promise(async resolve => {
    const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
    const mediaRecorder = new MediaRecorder(stream);
    const audioChunks = [];
    mediaRecorder.addEventListener("dataavailable", event => {
      audioChunks.push(event.data);
    });
    const start = () => mediaRecorder.start();
    const stop = () =>
      new Promise(resolve => {
        mediaRecorder.addEventListener("stop", () => {
          const audioBlob = new Blob(audioChunks);
          const audioUrl = URL.createObjectURL(audioBlob);
          const audio = new Audio(audioUrl);
          const play = () => audio.play();
          resolve({ audioBlob, audioUrl, play });
        });
        mediaRecorder.stop();
      });
    resolve({ start, stop });
  });
const sendAudio = (audioBlob) =>
  new Promise(async resolve => {
    var formData = new FormData();
    formData.append('audio', audioBlob, 'audio')
    $.ajax({
      type: 'PUT',
      url: '/audio',
      data: formData,
      processData: false,
      contentType: false
    }).done(function(data) {
      console.log(data);
    });
  })
const sleep = time => new Promise(resolve => setTimeout(resolve, time));
const handleAction = async () => {
  const recorder = await recordAudio();
  const actionButton = document.getElementById('action');
  actionButton.disabled = true;
  recorder.start();
  await sleep(3000);
  const audio = await recorder.stop();
  audio.play();
  await sendAudio(audio.audioBlob)
  await sleep(3000);
  actionButton.disabled = false;
};
And the HTML:
<!doctype html>
<html>
<head>
<title>Record Audio Test</title>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
</head>
<body>
<h1>Audio Recording Test</h1>
<p>Talk for 3 seconds, then you will hear your recording played back</p>
<script src="/static/index.js"></script>
<button id="action" onclick="handleAction()">Start recording...</button>
</body>
</html>
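Answer 0 (score: 0)
When Speech-to-Text returns an empty response, a likely cause is that the audio is not in the encoding the request declares. Make sure the actual parameters of the audio data (encoding, sample rate such as `sample_rate_hertz`) match the parameters you sent in the InitialRecognizeRequest.
For example, if your request specifies "encoding":"FLAC" and "sampleRateHertz":16000, the audio data parameters listed by the SoX play command should show the same values.
More information on this here.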
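The check described above can also be done server-side before calling the API. The following is a minimal sketch, not part of the original answer: the helper names (`sniff_container`, `flac_stream_info`, `check_against_config`) are hypothetical, and it only covers the case where the request declares FLAC. It inspects the first bytes of the upload to see which container it really is, then reads the sample rate from the FLAC STREAMINFO header. Note that a browser `MediaRecorder` commonly produces WebM or Ogg rather than FLAC, so a mismatch here is a plausible explanation for the empty result.

```python
# Leading "magic" bytes of containers a browser or encoder might produce.
MAGIC_PREFIXES = {
    b"fLaC": "FLAC",
    b"RIFF": "WAV (RIFF)",
    b"OggS": "Ogg",
    b"\x1a\x45\xdf\xa3": "WebM/Matroska",
}


def sniff_container(data: bytes) -> str:
    """Guess the container format from the first four bytes."""
    return MAGIC_PREFIXES.get(data[:4], "unknown")


def flac_stream_info(data: bytes) -> dict:
    """Read sample rate, channels and bit depth from a FLAC STREAMINFO block."""
    if data[:4] != b"fLaC":
        raise ValueError("not a FLAC stream: " + sniff_container(data))
    # Byte 4 is the first metadata block header; block type 0 is STREAMINFO.
    if len(data) < 22 or data[4] & 0x7F != 0:
        raise ValueError("missing STREAMINFO block")
    # Layout: 4-byte marker, 4-byte block header, then 10 bytes of
    # block/frame sizes. The packed 20-bit sample rate, 3-bit (channels - 1)
    # and 5-bit (bits per sample - 1) fields therefore start at offset 18.
    b18, b19, b20, b21 = data[18:22]
    return {
        "sample_rate_hertz": (b18 << 12) | (b19 << 4) | (b20 >> 4),
        "channels": ((b20 >> 1) & 0x07) + 1,
        "bits_per_sample": (((b20 & 0x01) << 4) | (b21 >> 4)) + 1,
    }


def check_against_config(data: bytes, expected_rate: int) -> list:
    """Return human-readable mismatches between the audio and a FLAC config."""
    container = sniff_container(data)
    if container != "FLAC":
        return ["audio looks like {}, but the request declares FLAC".format(container)]
    rate = flac_stream_info(data)["sample_rate_hertz"]
    if rate != expected_rate:
        return ["audio is {} Hz, but the request declares {} Hz".format(rate, expected_rate)]
    return []
```

In the Flask handler from the question, this could be called as `check_against_config(content, 48000)` on the raw uploaded bytes (before any base64 step) and the result logged; an empty list means the container and sample rate agree with the declared config.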