Question

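I use the Google Speech Recognition API. When I try to recognize short words lasting 0.25-0.5 seconds (for example "yes" or "no"), the Google API often returns NULL. I tried a different input data format, following the solution posted here (16-bit PCM, mono input audio file), but it did not improve the responses. Recognition of other, longer recordings works fine.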

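I tried to artificially increase the audio duration by adding silence before and after the word, so that the audio is at least 5 seconds long. The number of unrecognized samples dropped by a factor of 4, but it seems to me that it could be reduced further.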
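For reference, the padding step described above can be sketched roughly as follows. This is a minimal illustration, not the exact code I ran: the helper name `pad_with_silence` and its parameters are made up for this example, and it assumes raw 16-bit signed mono PCM samples, where silence is encoded as zero bytes.

```python
def pad_with_silence(pcm: bytes, sample_rate: int, min_seconds: float,
                     sample_width: int = 2) -> bytes:
    """Center raw mono PCM audio in silence so it lasts at least min_seconds.

    Hypothetical helper: assumes signed PCM (zero bytes are silence) and
    that len(pcm) is a whole number of samples.
    """
    target_bytes = int(min_seconds * sample_rate) * sample_width
    missing = target_bytes - len(pcm)
    if missing <= 0:
        # Already long enough, leave the audio untouched.
        return pcm
    # Split the padding into whole samples before and after the word.
    lead = (missing // sample_width // 2) * sample_width
    tail = missing - lead
    return b"\x00" * lead + pcm + b"\x00" * tail


# Example: a 0.5 s clip at 16 kHz padded to 5 s.
short_clip = b"\x01\x00" * 8000
padded = pad_with_silence(short_clip, sample_rate=16000, min_seconds=5.0)
```

The padded bytes are raw PCM, so they would have to be sent with the `LINEAR16` encoding or re-encoded to FLAC before being passed to the request shown in my code.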

How exactly does Google Speech Recognition handle short words?

My code:

import io

from google.cloud import speech
from google.cloud.speech import types
from google.oauth2 import service_account

# Authenticate with a service account key file
credentials = service_account.Credentials.from_service_account_file('credentials')

client = speech.SpeechClient(credentials=credentials)

# Loads the audio into memory (nn is the path to the audio file)
with io.open(nn, 'rb') as audio_file:
    content = audio_file.read()
    audio = types.RecognitionAudio(content=content)

config = types.RecognitionConfig(
    encoding='FLAC',
    language_code='ru-RU',
    sample_rate_hertz=16000,
    max_alternatives=maxAlternatives)

# Detects speech in the audio file
response = client.recognize(config, audio)
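
By "returns NULL" I mean that the response contains no results. A rough sketch of how the response can be checked is below; the helper name `extract_transcripts` is hypothetical, and it only assumes the response exposes `results`, each with `alternatives` carrying `transcript` and `confidence`.

```python
def extract_transcripts(response):
    """Collect (transcript, confidence) pairs from a recognize() response.

    Hypothetical helper: returns an empty list when the API found no
    speech, which is the "NULL" case for short words.
    """
    transcripts = []
    for result in response.results:
        for alternative in result.alternatives:
            transcripts.append((alternative.transcript, alternative.confidence))
    return transcripts
```

For a short word the returned list is often empty, while for longer recordings it holds up to `maxAlternatives` entries per result.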

Thanks.

Google Speech Recognition API: returns empty result

0 answers: