Question

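I am using Sphinx to convert audio to text, but I can't find how to access the confidence score for each word

I can access the transcription output, but I can't get the estimated probabilities behind the model. This sounds basic, but I can't find suitable documentation. What should I add to the code below?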

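import speech_recognition as sr

test = sr.AudioFile(audio_file)  # audio_file: path to the audio file
Recon = sr.Recognizer()

with test as source:
    test_audio = Recon.record(source)  # read the whole file into an AudioData object
text = Recon.recognize_sphinx(test_audio, language='en-US')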

Answer 1

The current version of speech-recognition does not return confidence results. If you look at the implementation:

def recognize_sphinx(...):
   ...
   # return results
   hypothesis = decoder.hyp()
   if hypothesis is not None: return hypothesis.hypstr
   raise UnknownValueError()  # no transcriptions available

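You will see that only the text result (hypothesis.hypstr) is returned, while the confidence is available as hypothesis.prob. A quick workaround is to copy-paste the entire method into your own code after installing Pocketsphinx separately: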

pip install pocketsphinx
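Once you have the decoder object that the copied method builds, reading the scores could look roughly like the sketch below. The helper names hypothesis_confidence and word_confidences are hypothetical, not part of either library. It assumes the Pocketsphinx Python bindings expose decoder.hyp(), decoder.seg() (segments with .word and .prob) and decoder.get_logmath(), and that .prob is stored in the log domain, so check these against the version you have installed.

```python
def hypothesis_confidence(decoder):
    """Return (text, confidence) for the best hypothesis, or None if there is none.

    Assumption: hyp.prob is a log-domain score that logmath.exp() maps
    back to a linear value.
    """
    logmath = decoder.get_logmath()
    hyp = decoder.hyp()
    if hyp is None:
        return None
    return hyp.hypstr, logmath.exp(hyp.prob)


def word_confidences(decoder):
    """Return a list of (word, confidence) pairs, one per decoded segment."""
    logmath = decoder.get_logmath()
    return [(seg.word, logmath.exp(seg.prob)) for seg in decoder.seg()]
```

In the copied method you would call these right after decoder.end_utt(), in place of returning hypothesis.hypstr. The segment list may also contain non-speech tokens such as silence markers, so filter those out if you only want real words. Some releases of speech-recognition also document a show_all=True argument to recognize_sphinx that returns the decoder itself, which would avoid the copy-paste; treat that as something to verify for your installed version.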
