Question

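I worked through the getting-started tutorial: https://cloud.google.com/speech/docs/getting-started. Then I tried to use my own audio file. I uploaded a .flac file with a sample rate of 16000 Hz.

I changed only the sync-request.json file below, pointing the uri at my own audio file hosted on Google Cloud Storage (gs://my-bucket/test4.flac):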

{
  "config": {
      "encoding":"flac",
      "sample_rate": 16000
  },
  "audio": {
      "uri":"gs://my-bucket/test4.flac"
  }
}

The file is found correctly, but the request returns an "INVALID_ARGUMENT" error:

{
  "error": {
    "code": 400,
    "message": "Unable to recognize speech, code=-73541, possible error in recognition config. Please correct the config and retry the request.",
    "status": "INVALID_ARGUMENT"
  }
}

Answer 1

According to this answer, all encodings support only 1-channel (mono) audio.

I created the FLAC file with this command:

ffmpeg -i test.mp3 test.flac

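This produced the error "Sample rate in request does not match FLAC header".

Adding -ac 1 (which sets the number of audio channels to 1) fixed the problem: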

ffmpeg -i test.mp3 -ac 1 test.flac
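If you want to confirm what ffmpeg actually wrote before sending the file, you can read the values straight from the FLAC header. The following is a minimal sketch, assuming the file starts with the "fLaC" marker followed directly by a STREAMINFO block, as the FLAC format specifies. The function names parseFlacStreamInfo and readFlacInfo are made up for this example and are not part of any Google library.

```javascript
const fs = require('fs');

// Parse the STREAMINFO block that follows the 4-byte "fLaC" marker.
// `buf` must hold at least the first 22 bytes of the file.
function parseFlacStreamInfo(buf) {
  if (buf.length < 22 || buf.toString('latin1', 0, 4) !== 'fLaC') {
    throw new Error('Not a FLAC stream (missing "fLaC" marker or too short)');
  }
  // Byte 4: last-block flag (1 bit) + block type (7 bits); type 0 is STREAMINFO.
  if ((buf[4] & 0x7f) !== 0) {
    throw new Error('First metadata block is not STREAMINFO');
  }
  // STREAMINFO data starts at byte 8. After 10 bytes of block/frame sizes come
  // 20 bits of sample rate, 3 bits of (channels - 1), 5 bits of (bits per sample - 1).
  const sampleRate = (buf[18] << 12) | (buf[19] << 4) | (buf[20] >> 4);
  const channels = ((buf[20] >> 1) & 0x07) + 1;
  const bitsPerSample = (((buf[20] & 0x01) << 4) | (buf[21] >> 4)) + 1;
  return { sampleRate, channels, bitsPerSample };
}

// Hypothetical helper: read only the header bytes of a file on disk.
function readFlacInfo(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const header = Buffer.alloc(22);
    const bytesRead = fs.readSync(fd, header, 0, 22, 0);
    return parseFlacStreamInfo(header.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

Calling readFlacInfo('test.flac') returns an object with sampleRate, channels and bitsPerSample. For the API to accept the file, channels should be 1 and sampleRate should equal the value you put in the request.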

Here is my complete Node.js code:

const Speech = require('@google-cloud/speech');
const projectId = 'EnterProjectIdGeneratedByGoogle';

const speechClient = Speech({
    projectId: projectId
});

// The name of the audio file to transcribe
const fileName = '/home/user/Documents/test/test.flac';

// The audio file's encoding and sample rate
const options = {
    encoding: 'FLAC',
    sampleRate: 44100
};

// Detects speech in the audio file
speechClient.recognize(fileName, options)
    .then((results) => {
        const transcription = results[0];
        console.log(`Transcription: ${transcription}`);
    }, function(err) {
        console.log(err);
    });

The sample rate can be 16000, 44100, or another valid value, and the encoding can be FLAC or LINEAR16. See the Cloud Speech Docs.

Answer 2

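My bad: according to the documentation at https://cloud.google.com/speech/docs/basics, the .flac file must be 16-bit PCM.

Summary:

Encoding: FLAC
Channels: 1 @ 16-bit
Sample rate: 16000 Hz

/!\ Be careful not to export a stereo (2-channel) file; that raises a different error, since only one channel is accepted: Google speech API internal server error -83104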
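The checklist above can be turned into a quick pre-flight check. This is only a sketch: findConfigProblems is a name invented here, the info object is assumed to come from whatever tool you use to inspect the file (ffprobe, for instance), and the sample_rate field name is taken from the sync-request.json in the question.

```javascript
// Compare audio properties with the "config" object of a sync request
// and list the likely problems.
// `info` is { sampleRate, channels, bitsPerSample } for the audio file.
function findConfigProblems(info, config) {
  const problems = [];
  if (info.channels !== 1) {
    problems.push(`audio has ${info.channels} channels; only mono is accepted`);
  }
  if (info.bitsPerSample !== 16) {
    problems.push(`audio is ${info.bitsPerSample}-bit; 16-bit is expected`);
  }
  if (info.sampleRate !== config.sample_rate) {
    problems.push(
      `sample_rate ${config.sample_rate} in the request does not match ` +
      `${info.sampleRate} Hz in the file`
    );
  }
  return problems;
}

// Example: a stereo 44.1 kHz export checked against the request from the question.
const requestConfig = { encoding: 'FLAC', sample_rate: 16000 };
const stereoExport = { sampleRate: 44100, channels: 2, bitsPerSample: 16 };
console.log(findConfigProblems(stereoExport, requestConfig));
```

An empty array means the file matches the three points of the summary; it does not guarantee the API will accept the request.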

Google Cloud Speech sync "INVALID_ARGUMENT"
