Make chunks from an audio file with overlap in Python

Time: 2019-01-24 08:02:09

Tags: python audio chunks segment

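I want to split an audio file into chunks that overlap. For example, if each chunk is 4 seconds long, the first chunk runs from 0 to 4, and the overlap is 1 second, then the second chunk should run from 3 to 7. Following "How to splice an audio file (wav format) into 1 sec splices in python?", I used the pydub module and its chunking method, but that produces no overlap between chunks; it only cuts the audio file into fixed-length chunks. Does anyone have an idea how to do this? Thanks.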

1 answer:

Answer 0 (score: 1)

Here is one way to do it:

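import numpy as np
from scipy.io import wavfile

path = "audio.wav"  # placeholder: path to your WAV file
frequency, signal = wavfile.read(path)

slice_length = 4  # in seconds
overlap = 1  # in seconds
duration = len(signal) // frequency  # length of the audio in whole seconds
# start of each slice, in seconds (np.int was removed from NumPy, so use int)
slices = np.arange(0, duration, slice_length - overlap, dtype=int)

for start, end in zip(slices[:-1], slices[1:]):
    start_audio = start * frequency  # convert seconds to sample indices
    end_audio = (end + overlap) * frequency  # extend the end by the overlap
    audio_slice = signal[start_audio:end_audio]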

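Essentially, we do the following:

  1. Load the file and its corresponding frequency (the sample rate). For the sake of the example I assume it has a single channel; with multiple channels it works the same way, it just takes a bit more code.
  2. Define the desired slice length and overlap. The slices array gives us the start of each audio slice, in seconds. By zipping it with itself shifted by one and adding the overlap to each end, we get the desired chunks.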
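The two steps above can also be wrapped into a small generator. This is only a sketch, not the code from the answer: the name overlapping_chunks and its parameters are made up for illustration, frequency is assumed to be an integer number of samples per second, and only full-length chunks are produced. Because it slices along the first axis, it should work unchanged on a plain list, a mono NumPy array, or a multi-channel array of shape (samples, channels).

```python
def overlapping_chunks(samples, frequency, slice_length=4, overlap=1):
    """Yield (start, end, chunk), with start and end as sample indices.

    Hypothetical helper: `samples` is any sliceable sequence and
    `frequency` is the (integer) number of samples per second.
    """
    if not 0 <= overlap < slice_length:
        raise ValueError("overlap must be >= 0 and smaller than slice_length")
    size = slice_length * frequency              # chunk length in samples
    step = (slice_length - overlap) * frequency  # distance between chunk starts
    # Stop as soon as a full-length chunk no longer fits.
    for start in range(0, len(samples) - size + 1, step):
        end = start + size
        yield start, end, samples[start:end]


# Toy input: one "sample" per second, 26 seconds long.
demo = list(range(26))
bounds = [(start, end) for start, end, _ in overlapping_chunks(demo, frequency=1)]
print(bounds)
```

With a real file you would pass the array and sample rate returned by wavfile.read instead of the toy list.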

To convince yourself that the slicing works, take a look at the following snippet:

import numpy as np

slice_length = 4  # in seconds
overlap = 1  # in seconds
slices = np.arange(0, 26, slice_length - overlap, dtype=int)  # 26 is arbitrary

frequency = 1  # one sample per second, so indices read as seconds
for start, end in zip(slices[:-1], slices[1:]):
    start_audio = start * frequency
    end_audio = (end + overlap) * frequency
    print(start_audio, end_audio)

Output:

0 4
3 7
6 10
9 13
12 16
15 19
18 22
21 25
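
Note that the last chunk in this output ends at 25, so with the 26-second example the final second is not part of any chunk. If that tail matters, one option is to emit a shorter last chunk. A minimal sketch, working in whole seconds; chunk_bounds and keep_remainder are hypothetical names, not something the answer defines:

```python
def chunk_bounds(duration, slice_length=4, overlap=1, keep_remainder=True):
    """Return (start, end) pairs in seconds for overlapping chunks.

    Hypothetical helper: when keep_remainder is True, a final chunk
    shorter than slice_length is added to cover the end of the audio.
    """
    step = slice_length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than slice_length")
    bounds = []
    for start in range(0, duration, step):
        end = min(start + slice_length, duration)
        if end - start < slice_length and not keep_remainder:
            break  # drop the short tail
        bounds.append((start, end))
        if end == duration:
            break  # the audio is fully covered
    return bounds


print(chunk_bounds(26))
print(chunk_bounds(26, keep_remainder=False))
```

Multiply each pair by the sample rate to get the indices for slicing the signal.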