Unable to split audio using split_on_silence()

Time: 2018-06-19 06:40:20

Tags: python speech-recognition speech-to-text google-speech-api pydub

I have tried many combinations of min_silence_len and silence_thresh, but it always returns a list of length 1. What am I missing?

import speech_recognition as sr
from pydub import AudioSegment
from pydub.silence import split_on_silence

# obtain audio from the microphone
r = sr.Recognizer()
#r = sr.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")
with sr.Microphone() as source:
    print("Please wait. Calibrating microphone...")
    # listen for 5 seconds and create the ambient noise energy level
    r.adjust_for_ambient_noise(source, duration=5)
    print("Say something!")
    # listen() must run while the microphone source is still open
    audio = r.listen(source)

# a is the recognized transcript (a str), not audio data
a = r.recognize_google(audio)
# min_silence_len is in milliseconds, silence_thresh is in dBFS
audio_chunks = split_on_silence(a, min_silence_len=30, silence_thresh=-5)
print(len(audio_chunks))

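For example: when I say "what is wrong with this code" into the microphone, I expect the whole sentence to be split into chunks, so that audio_chunks is an array of words: what || is || wrong || with || this || code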

Actual result: len(audio_chunks) = 1, audio_chunks = ['what is wrong with this code']
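
One thing worth checking: r.recognize_google(audio) returns the transcript as a string, while split_on_silence is meant to be given a pydub AudioSegment, so the call above is handed text rather than sound. The captured sr.AudioData would probably need converting first, for example via AudioSegment(data=audio.get_raw_data(), sample_width=audio.sample_width, frame_rate=audio.sample_rate, channels=1). Separately, silence_thresh is a level in dBFS, and -5 is close to full scale, so most speech would sit below it and count as silence. The sketch below is not pydub's implementation; it is a simplified, standard-library-only illustration of threshold-based splitting on 16-bit PCM samples. The names dbfs, nonsilent_ranges and tone, the 10 ms window, and the synthetic 440 Hz signal are all assumptions made for the example.

```python
import math


def dbfs(samples, sample_width=2):
    """RMS level of signed PCM samples, in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms == 0:
        return float("-inf")
    full_scale = 2 ** (8 * sample_width - 1)  # 32768 for 16-bit audio
    return 20 * math.log10(rms / full_scale)


def nonsilent_ranges(samples, frame_rate, min_silence_ms, thresh_db, window_ms=10):
    """Return (start, end) sample ranges separated by long-enough silence."""
    win = max(1, frame_rate * window_ms // 1000)
    min_windows = max(1, math.ceil(min_silence_ms / window_ms))
    # True where a window is quieter than the threshold.
    quiet = [dbfs(samples[i:i + win]) < thresh_db
             for i in range(0, len(samples), win)]
    ranges, start, run = [], None, 0
    for idx, is_quiet in enumerate(quiet):
        if is_quiet:
            run += 1
            # Close the open chunk once the pause is long enough.
            if run == min_windows and start is not None:
                ranges.append((start * win, (idx - run + 1) * win))
                start = None
        else:
            if start is None:
                start = idx
            run = 0
    if start is not None:
        ranges.append((start * win, len(samples)))
    return ranges


def tone(ms, frame_rate=8000, freq=440, amp=10000):
    """Hypothetical test signal: a sine tone as a list of 16-bit samples."""
    n = frame_rate * ms // 1000
    return [int(amp * math.sin(2 * math.pi * freq * t / frame_rate))
            for t in range(n)]


rate = 8000
# 300 ms of tone, 200 ms of digital silence, 300 ms of tone
samples = tone(300) + [0] * (rate * 200 // 1000) + tone(300)

# A threshold well below the tone level finds two chunks.
print(nonsilent_ranges(samples, rate, min_silence_ms=100, thresh_db=-40))
# [(0, 2400), (4000, 6400)]

# With -5 dBFS even the tone (about -13 dBFS) counts as silence.
print(nonsilent_ranges(samples, rate, min_silence_ms=30, thresh_db=-5))
# []
```

With real speech, pauses between words are often shorter than any practical min_silence_len, so silence-based splitting tends to yield phrases rather than single words. If per-word text is the goal, calling a.split() on the transcript may be closer to what is wanted.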

0 answers:

No answers yet