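I'm trying to transcribe an audio file about 3 minutes long with SpeechRecognition, but I can't seem to transcribe anything beyond the first 20 seconds. Here is the code I'm using: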
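    import speech_recognition as sr
    from mutagen.flac import FLAC  # mutagen is used only to read the audio length

    # output_name and output_format are set elsewhere (not shown)
    r = sr.Recognizer()
    audio = FLAC(output_name + '.' + output_format)
    audio_length = audio.info.length  # length in seconds
    file = sr.AudioFile(output_name + '.' + output_format)
    with file as source:
        audio = r.record(source, duration=20)  # reads only the first 20 seconds
    google = r.recognize_google(audio, language='ru-RU')
    print(google)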
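How can I loop over the file so that it transcribes 0s-20s, then 20s-40s, and so on until the end of the audio file?
I'd like to avoid splitting the audio into separate 20-second files if at all possible.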
Answer 0 (score: 1)
So I figured it out. My mistake was not reading the SpeechRecognition documentation carefully enough: record() takes an offset parameter!
    import math
    import speech_recognition as sr
    from mutagen.flac import FLAC  # n.b. mutagen is used to calculate the audio length

    r = sr.Recognizer()
    # audio_files, audio_list and output_format are set elsewhere (not shown)
    for count, audio_path in enumerate(audio_files):
        path = audio_list[count] + '.' + output_format
        audio_length = FLAC(path).info.length  # length of the audio file in seconds
        # round up so the final partial chunk is transcribed too
        number_of_iterations = max(1, math.ceil(audio_length / 20))
        file = sr.AudioFile(path)
        for i in range(number_of_iterations):
            with file as source:
                audio = r.record(source, offset=i * 20, duration=20)
            google = r.recognize_google(audio, language='ru-RU')
            print(google)  # print every chunk, not just the last one
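For anyone who wants the chunking logic separated from the recognition call, here is a minimal sketch of the same idea. The names chunk_windows and transcribe_in_chunks are made up for illustration and are not part of SpeechRecognition. It assumes you already know the file length in seconds (for example from mutagen, as above), and it skips any chunk where Google raises UnknownValueError, which is my own choice rather than something the code above does.

```python
import math


def chunk_windows(total_seconds, chunk_seconds=20.0):
    """Return (offset, duration) pairs that cover the whole recording."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        return []
    count = math.ceil(total_seconds / chunk_seconds)
    windows = []
    for i in range(count):
        offset = i * chunk_seconds
        # The last window is shortened so it stops at the end of the file.
        duration = min(chunk_seconds, total_seconds - offset)
        windows.append((offset, duration))
    return windows


def transcribe_in_chunks(path, total_seconds, language="ru-RU", chunk_seconds=20.0):
    """Hypothetical helper: transcribe one file window by window and join the text."""
    # Imported here so chunk_windows can be used without the library installed.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    pieces = []
    for offset, duration in chunk_windows(total_seconds, chunk_seconds):
        with sr.AudioFile(path) as source:
            audio = recognizer.record(source, offset=offset, duration=duration)
        try:
            pieces.append(recognizer.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            # No intelligible speech in this window; skip it (an assumption).
            continue
    return " ".join(pieces)
```

With a 65-second file, chunk_windows(65) gives four windows, the last one 5 seconds long, so nothing at the end of the recording is dropped. Note that cutting at fixed 20-second boundaries can split a word in half, so the joined text may have small glitches at the seams.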