Question

I am trying to extract the tempo and beats from an audio file (sampling rate: 2000) using the following code:

import librosa

data, sr = librosa.load(path, mono=True, sr=2000)
print("self.sr :", sr)
onset_env = librosa.onset.onset_strength(y=data, sr=sr)
tempo, beats = librosa.beat.beat_track(y=data, sr=sr, onset_envelope=onset_env)
print("tempo :", tempo)
beats = librosa.frames_to_time(beats, sr=sr)
print("beats :", beats)

I only changed the sampling rate.

However, the output looks strange:

/usr/lib/python3.6/site-packages/librosa/filters.py:284: UserWarning: Empty filters detected in mel frequency basis. Some channels will produce empty responses. Try increasing your sampling rate (and fmax) or reducing n_mels.
  warnings.warn('Empty filters detected in mel frequency basis. '
/usr/lib64/python3.6/site-packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  z[index] = x
tempo : 117.1875
beats : [  4   6   8  10  12  14  16  18  20  22  24  26  28  30  32  34  36  38
  40  42  44  46  48  50  52  54  56  58  60  62  64  66  68  70  72  74
  76  78  80  82  84  86  88  90  92  94  96  98 100 102 104 106 108 110
 112 114 116 118 120 122 124 126 128 130 132 134 136 138 140 142 144 146
 148 150 152 154 156 158 160 162 164 166 168 170 172 174 176 178 180 182
 184 186 188 190 192 194 196 198 200 202 204 206 208 210 212 214 216 218
 220 222 224 226 228 230 232 234 236 238 240 242 244 246 248 250 252 254
 256 258 260 262 264 266 268 270 272 274 276 278 280 282 284 286 288 290
 292 294 296 298 300 302 304 306 308 310 312 314 316 318 320 322 324 326
 328 330 332 334 336 338 340 342 344 346 348 350 352 354 356 358 360 362
 364 366 368 370 372 374 376 378 380 382 384 386 388 390 392 394 396 398
 400 402 404 406 408 410 412 414 416 418 420 422 424 426 428 430 432 434
 436 438 440 442 444 446 448 450 452 454 456 458 460 462 464 466]

So I removed the sr argument and ran the following code:

import librosa

data, sr = librosa.load(path, mono=True)
print("self.sr :", sr)
onset_env = librosa.onset.onset_strength(y=data, sr=sr)
tempo, beats = librosa.beat.beat_track(y=data, sr=sr, onset_envelope=onset_env)
print("tempo :", tempo)
beats = librosa.frames_to_time(beats, sr=sr)
print("beats :", beats)

Here is the output with the sr argument removed:

self.sr : 22050
/usr/lib64/python3.6/site-packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  z[index] = x
tempo : 161.4990234375
beats : [   7   23   39   55   71   87  102  118  134  150  166  182  197  213
  228  244  260  276  292  307  323  339  355  371  387  404  420  438
  454  470  486  501  517  533  549  565  581  596  612  628  644  659
  675  691  706  722  738  754  770  786  801  817  833  850  868  884
  900  916  932  948  964  980  996 1011 1027 1043 1059 1074 1090 1106
 1121 1137 1153 1168 1184 1201 1216 1232 1248 1264 1279 1293 1312 1331
 1347 1363 1379 1394 1410 1426 1442 1458 1474 1489 1505 1520 1536 1552
 1568 1584 1599 1615 1631 1647 1663 1679 1696 1712 1730 1746 1762 1778
 1793 1809 1825 1841 1857 1873 1888 1904 1920 1936 1951 1967 1983 1998
 2014 2030 2046 2062 2078 2093 2109 2125 2142 2160 2176 2192 2208 2224
 2240 2256 2272 2288 2303 2319 2335 2351 2366 2382 2398 2413 2429 2445
 2460 2476 2492 2508 2524 2540 2556 2571 2585 2604 2623 2639 2655 2671
 2686 2702 2718 2734 2750 2766 2781 2797 2812 2828 2844 2860 2876 2891
 2907 2923 2939 2955 2971 2988 3004 3022 3038 3054 3070 3085 3101 3117
 3133 3149 3165 3180 3196 3212 3228 3243 3259 3275 3290 3306 3322 3338
 3354 3370 3385 3401 3417 3434 3452 3468 3484 3500 3516 3532 3548 3564
 3580 3595 3611 3627 3643 3658 3674 3690 3705 3721 3737 3752 3768 3784
 3800 3816 3832 3848 3863 3877 3896 3915 3931 3947 3963 3978 3994 4010
 4026 4042 4058 4073 4089 4104 4120 4136 4152 4168 4183 4199 4215 4231
 4247 4263 4280 4296 4314 4330 4346 4362 4377 4393 4409 4425 4441 4457
 4472 4488 4504 4520 4535 4551 4567 4582 4598 4614 4630 4646 4662 4677
 4693 4709 4726 4744 4760 4776 4792 4808 4824 4840 4856 4872 4887 4903
 4919 4935 4950 4966 4982 4997 5013 5029 5044 5060 5076 5092 5108 5124]

How can I get this to work correctly after changing sr?

Thanks.

Answer 1

When you call

data, sr = librosa.load(path, mono=True, sr=2000)

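you are asking librosa to resample whatever your input is to 2000 Hz (see the docs: "target sampling rate"). 2000 Hz is a highly unusual sampling rate for music, and IMHO many of the algorithms in librosa may not work properly with it. Typical rates are 44.1 kHz (CD quality) or 22050 Hz (the librosa default).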

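I believe the beat tracker tries to split your data into mel bands and then processes those bands individually, perhaps with some novelty curve or onset signal function. At 2 kHz there is not much signal to work with, which is probably why you see the empty-filters message. However, if the result for sr=2000 is correct, you can ignore that warning.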
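To make the empty-filter idea concrete, here is a rough sketch that counts how many mel bands would lie entirely above the Nyquist frequency (sr / 2), where there are no FFT bins for them to cover. It rests on several assumptions that are not from librosa itself: the band edges are assumed to run up to a fixed 11025 Hz regardless of sr (a guess based on the warning's hint about fmax), the HTK-style mel formula is used for simplicity, and the function names are purely illustrative.

```python
import math

def hz_to_mel(f):
    """HTK-style Hz -> mel conversion (a simplification, chosen for brevity)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def count_bands_above_nyquist(sr, n_mels=128, fmax=11025.0):
    """Count mel bands whose lower edge lies at or above sr / 2.

    fmax=11025.0 is an assumed upper limit for the filter bank,
    not a value taken from the question or from librosa's docs.
    """
    nyquist = sr / 2.0
    m_max = hz_to_mel(fmax)
    # n_mels triangular bands need n_mels + 2 edge frequencies,
    # spaced evenly on the mel scale between 0 Hz and fmax.
    edges = [mel_to_hz(m_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Band i spans edges[i]..edges[i + 2]; if it starts at or above
    # Nyquist, no FFT bin can fall inside it, so the filter is empty.
    return sum(1 for i in range(n_mels) if edges[i] >= nyquist)

empty = {sr: count_bands_above_nyquist(sr) for sr in (2000, 22050)}
for sr, n_empty in empty.items():
    print("sr =", sr, "-> bands above Nyquist:", n_empty)
```

Under these assumptions, a large share of the bands is empty at sr=2000 and none is at sr=22050, which is consistent with the warning showing up only in the first run.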

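To me, though, not setting sr seems like the safer option: let librosa resample your audio (whatever it is) to 22050 Hz and then run the beat tracking algorithm on that. 22050 Hz is the sampling rate the algorithm was most likely developed and tested with, so it has the best chance of working.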

Regarding:

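/usr/lib64/python3.6/site-packages/scipy/fftpack/basic.py:160: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.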

This appears to be a warning related to how librosa implements something internally. You should be able to safely ignore it.
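If the message is too noisy, one option is to silence it with Python's standard warnings module. A minimal sketch: noisy_step is a hypothetical stand-in for the call that triggers the warning, and call_without_future_warnings is just an illustrative helper name, not part of any library.

```python
import warnings

def noisy_step():
    """Hypothetical stand-in for a library call that emits a FutureWarning."""
    warnings.warn(
        "Using a non-tuple sequence for multidimensional indexing is deprecated",
        FutureWarning,
    )
    return "done"

def call_without_future_warnings(func):
    """Run func with FutureWarning suppressed.

    Returns the function's result and the list of any other warnings
    that were still raised, so nothing else is hidden silently.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning ...
        # ... except FutureWarning; this filter is inserted in front
        # of the one above, so it takes precedence.
        warnings.filterwarnings("ignore", category=FutureWarning)
        result = func()
    return result, list(caught)

result, caught = call_without_future_warnings(noisy_step)
print(result, len(caught))
```

The filters are restored when the with block exits, so the suppression only applies to that one call rather than to the whole program.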
