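I have a sound signal, imported as a numpy array, and I want to cut it into chunks that are themselves numpy arrays. However, I want the chunks to contain only elements above a threshold. For example: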
threshold = 3
signal = [1,2,6,7,8,1,1,2,5,6,7]
should output two arrays
vec1 = [6,7,8]
vec2 = [5,6,7]
OK, the above are lists, but you get my point.
Here is what I have tried so far, but it just kills my memory
def slice_raw_audio(audio_signal, threshold=5000):
    signal_slice, chunks = [], []
    for idx in range(0, audio_signal.shape[0], 1000):
        while audio_signal[idx] > threshold:
            signal_slice.append(audio_signal[idx])
        chunks.append(signal_slice)
    return chunks
Answer 0 (score: 2)
Here's one approach -
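import numpy as np

def split_above_threshold(signal, threshold):
    # Pad with False so runs touching either end still get a start and a stop edge
    mask = np.concatenate(([False], signal > threshold, [False]))
    # Indices where the mask flips; they alternate start, stop, start, stop, ...
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i+1]] for i in range(0, len(idx), 2)]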
Sample run -
In [48]: threshold = 3
...: signal = np.array([1,1,7,1,2,6,7,8,1,1,2,5,6,7,2,8,7,2])
...:
In [49]: split_above_threshold(signal, threshold)
Out[49]: [array([7]), array([6, 7, 8]), array([5, 6, 7]), array([8, 7])]
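To see why the padded mask works, it can help to look at the intermediate arrays for the small example from the question. The sketch below only replays the steps of split_above_threshold; the names edges, pairs and chunks are illustrative and not part of the answer.

```python
import numpy as np

signal = np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7])
threshold = 3

# Pad with False on both sides so every run above the threshold
# has both a start edge and a stop edge, even at the array ends.
mask = np.concatenate(([False], signal > threshold, [False]))

# Positions where the mask flips; because of the padding they
# always come in (start, stop) pairs.
edges = np.flatnonzero(mask[1:] != mask[:-1])
pairs = edges.reshape(-1, 2)

# Each pair is a half-open slice [start, stop) into the original signal.
chunks = [signal[start:stop] for start, stop in pairs]
print(edges)   # [ 2  5  8 11]
print(chunks)  # [array([6, 7, 8]), array([5, 6, 7])]
```

Since each chunk is a basic slice of signal, it is a view rather than a copy, which is likely part of why this version is light on memory.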
Other approaches -
# @Psidom's soln
def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)

# @Kasramvd's soln
def split_diff_step(signal, threshold):
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]
Timings -
In [67]: signal = np.random.randint(0,9,(100000))
In [68]: threshold = 3
# @Kasramvd's soln
In [69]: %timeit split_diff_step(signal, threshold)
10 loops, best of 3: 39.8 ms per loop
# @Psidom's soln
In [70]: %timeit arange_diff(signal, threshold)
10 loops, best of 3: 20.5 ms per loop
In [71]: %timeit split_above_threshold(signal, threshold)
100 loops, best of 3: 8.22 ms per loop
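Before reading too much into the timings, it may be worth checking that the three functions return the same chunks. The following is a minimal sketch of such a check; the helper same_chunks and the seeded generator are assumptions added for illustration. Note that split_diff_step always takes [1::2], so the check forces the first sample to lie below the threshold.

```python
import numpy as np

def split_above_threshold(signal, threshold):
    mask = np.concatenate(([False], signal > threshold, [False]))
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    return [signal[idx[i]:idx[i + 1]] for i in range(0, len(idx), 2)]

def arange_diff(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0] + 1)

def split_diff_step(signal, threshold):
    return np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]

def same_chunks(a, b):
    # Hypothetical helper: two chunk lists match if they have the same
    # length and every pair of arrays is element-wise equal.
    return len(a) == len(b) and all(np.array_equal(x, y) for x, y in zip(a, b))

rng = np.random.default_rng(0)       # seeded only for reproducibility
signal = rng.integers(0, 9, 100_000)
signal[0] = 0                        # [1::2] assumes the signal starts below the threshold
threshold = 3

reference = split_above_threshold(signal, threshold)
agree = (same_chunks(reference, arange_diff(signal, threshold))
         and same_chunks(reference, split_diff_step(signal, threshold)))
print(agree)
```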
Answer 1 (score: 2)
Here is a Numpythonic approach:
In [115]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)
Out[115]: [array([1, 2]), array([6, 7, 8]), array([1, 1, 2]), array([5, 6, 7])]
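Note that this gives you all the lower and upper chunks. Because of the splitting logic (based on diff and on runs of consecutive items), the two kinds always alternate, which means you can separate them simply by indexing: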
In [121]: signal = np.array([1,2,6,7,8,1,1,2,5,6,7])
In [122]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[::2]
Out[122]: [array([1, 2]), array([1, 1, 2])]
In [123]: np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[1::2]
Out[123]: [array([6, 7, 8]), array([5, 6, 7])]
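You can compare the first item of the array with threshold to find out which of the two slices above gives you the upper items: start at index 1 when the first item is not above the threshold, and at index 0 otherwise.
In general, you can use the following snippet to get the upper items:
np.split(signal, np.where(np.diff(signal > threshold))[0] + 1)[int(signal[0] <= threshold)::2]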
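The parity rule can be wrapped in a small function that picks the starting offset from the first element. This is a minimal sketch assuming a non-empty 1-D array; the name split_above is hypothetical and not from the answer.

```python
import numpy as np

def split_above(signal, threshold):
    # Hypothetical wrapper around the np.split / np.diff idea.
    # Assumes `signal` is a non-empty 1-D array.
    above = signal > threshold
    # np.diff on a boolean array marks the positions where the mask flips.
    pieces = np.split(signal, np.flatnonzero(np.diff(above)) + 1)
    # Pieces alternate between "above" and "not above"; the first piece is
    # an "above" piece only if the first sample exceeds the threshold.
    start = 0 if above[0] else 1
    return pieces[start::2]

print(split_above(np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7]), 3))
print(split_above(np.array([7, 1, 2, 6]), 3))
```

Using an explicit 0 or 1 for the offset avoids relying on a NumPy boolean being accepted as a slice index.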
Answer 2 (score: 1)
Here's an option:
above_th = signal > threshold
index, values = np.arange(signal.size)[above_th], signal[above_th]
np.split(values, np.where(np.diff(index) > 1)[0]+1)
# [array([6, 7, 8]), array([5, 6, 7])]
Wrapped in a function:
def above_thresholds(signal, threshold):
    above_th = signal > threshold
    index, values = np.arange(signal.size)[above_th], signal[above_th]
    return np.split(values, np.where(np.diff(index) > 1)[0]+1)
above_thresholds(signal, threshold)
# [array([6, 7, 8]), array([5, 6, 7])]
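One edge case worth keeping in mind: when no element exceeds the threshold, np.split on the empty values array still returns a list containing a single empty array. If an empty list is preferred in that case, a variant along these lines could be used; the name above_thresholds_nonempty is hypothetical.

```python
import numpy as np

def above_thresholds_nonempty(signal, threshold):
    # Hypothetical variant of above_thresholds that drops empty chunks.
    above_th = signal > threshold
    index, values = np.flatnonzero(above_th), signal[above_th]
    # A gap larger than 1 between kept indices marks the start of a new run.
    chunks = np.split(values, np.where(np.diff(index) > 1)[0] + 1)
    return [c for c in chunks if c.size]

print(above_thresholds_nonempty(np.array([1, 2, 6, 7, 8, 1, 1, 2, 5, 6, 7]), 3))
print(above_thresholds_nonempty(np.array([1, 2, 3]), 3))
```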