I need to find the energy of peaks with Librosa so that I can detect the first beat of each measure.
I am using Librosa to detect audio beats in a click track. This works well, but I now want to detect the first beat of each measure. I believe the best way to do this is to look at the energy or pitch of each beat.
At the moment I am recording all the beats into an array. How can I detect the first beat of each measure?
import librosa
import numpy as np

def findPeaks(inputFile):
    print(">>> Finding peaks...\n")
    y, sr = librosa.load(inputFile)
    onset_env = librosa.onset.onset_strength(
        y=y, sr=sr, hop_length=512, aggregate=np.median
    )
    global inputTrackPeaks  # array of peaks
    inputTrackPeaks = librosa.util.peak_pick(
        onset_env, pre_max=3, post_max=3, pre_avg=3, post_avg=5, delta=0.5, wait=10
    )
    inputTrackPeaks = librosa.frames_to_time(inputTrackPeaks, sr=sr)
    inputTrackPeaks = inputTrackPeaks * 1000  # convert array to milliseconds
    print("Peak positions (ms): \n", inputTrackPeaks)
Answer (score: 1)
For a very simple beat tracker, you probably want to use librosa's built-in beat tracking:
import librosa
y, sr = librosa.load(librosa.util.example_audio_file())
tempo, beats = librosa.beat.beat_track(y=y, sr=sr)
# beats now contains the beat *frame positions*
# convert to timestamps like this:
beat_times = librosa.frames_to_time(beats, sr=sr)
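If you want these timestamps in milliseconds, to match the format used in your findPeaks function, you can simply scale them (a small addition on top of the answer's code):

# convert beat times from seconds to milliseconds
beat_times_ms = beat_times * 1000
print('Beat positions (ms): \n', beat_times_ms)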
This gives you the beat positions. However, what you are actually asking for is downbeat estimation. The idea of finding the beats with the highest energy is good, but you may want to incorporate some additional knowledge and aggregate over corresponding beats. For example, if you know the track is in 4/4 time, you could sum up the energy of every fourth beat and then conclude that the beat position with the highest energy sum is the downbeat.
Roughly like this:
import librosa
import numpy as np
y, sr = librosa.load('my file.wav')
# get onset envelope
onset_env = librosa.onset.onset_strength(y=y, sr=sr, aggregate=np.median)
# get tempo and beats
tempo, beats = librosa.beat.beat_track(onset_envelope=onset_env, sr=sr)
# we assume 4/4 time
meter = 4
# calculate number of full measures
measures = (len(beats) // meter)
# get onset strengths for the known beat positions
# Note: this is somewhat naive, as the main strength may be *around*
# rather than *on* the detected beat position.
beat_strengths = onset_env[beats]
# make sure we only consider full measures
# and convert to 2d array with indices for measure and beatpos
measure_beat_strengths = beat_strengths[:measures * meter].reshape(-1, meter)
# add up strengths per beat position
beat_pos_strength = np.sum(measure_beat_strengths, axis=0)
# find the beat position with max strength
downbeat_pos = np.argmax(beat_pos_strength)
# convert the beat positions to the same 2d measure format
full_measure_beats = beats[:measures * meter].reshape(-1, meter)
# and select the beat position we want: downbeat_pos
downbeat_frames = full_measure_beats[:, downbeat_pos]
print('Downbeat frames: {}'.format(downbeat_frames))
# print times
downbeat_times = librosa.frames_to_time(downbeat_frames, sr=sr)
print('Downbeat times in s: {}'.format(downbeat_times))
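To address the caveat in the comments above, that the main onset strength may fall around rather than exactly on a detected beat frame, one possible refinement (my own sketch, not part of the original answer) is to take the local maximum of the onset envelope in a small window around each beat frame before summing:

# Sketch of a refinement (an assumption, not the original answer's code):
# use the local maximum of the onset envelope in a +/- 2-frame window
# around each detected beat instead of the value at the exact beat frame.
window = 2
beat_strengths = np.array([
    onset_env[max(0, b - window):b + window + 1].max()
    for b in beats
])
# the rest of the code (reshaping, summing, argmax) stays the same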
Your mileage with this kind of code will vary. Success depends on the kind of music, the genre, the meter, the quality of the beat detection, and so on. That's because it is not trivial. In fact, downbeat estimation is a current research topic in Music Information Retrieval (MIR) and is not completely solved. For a recent review of advanced, deep-learning-based automatic downbeat tracking, you may want to check out this article.
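As an illustration of such learned approaches, here is a minimal sketch using the madmom library (my own example, not something from the original answer; it assumes madmom is installed and that the track is in 3/4 or 4/4 time):

# Hedged sketch: madmom's RNN + DBN downbeat tracker (assumes madmom is installed)
from madmom.features.downbeats import RNNDownBeatProcessor, DBNDownBeatTrackingProcessor

# compute joint beat/downbeat activations with a pre-trained RNN
act = RNNDownBeatProcessor()('my file.wav')
# decode beat times and their position within the bar (assuming 3/4 or 4/4 time)
proc = DBNDownBeatTrackingProcessor(beats_per_bar=[3, 4], fps=100)
beats_and_positions = proc(act)  # rows of (time in seconds, beat number within the bar)
# downbeats are the beats whose position within the bar is 1
downbeat_times = beats_and_positions[beats_and_positions[:, 1] == 1, 0]
print('Downbeat times in s: {}'.format(downbeat_times))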