Question

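I'm new to audio processing and need some help with my project. Can someone explain the difference between the data returned by librosa.load and scipy.io.wavfile.read? The former gives an array of floats, while the latter gives an array of integers. Interestingly, the two returned arrays also differ in size.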

Please provide some insight. (You can reproduce the issue with an audio file of your own.)

sig, sr = librosa.core.load(filepath, sr=None)
sig[:10]
array([ 0.00262944,  0.00108277, -0.00248273, -0.00865669, -0.0161767 ,
   -0.01958228, -0.01867038, -0.01742653, -0.01652605, -0.01589082],
  dtype=float32)

sr, y = scipy.io.wavfile.read(filepath)
y[:10]
array([  94,  -10, -217, -564, -627, -582, -527, -520, -440, -349],
  dtype=int16)

print(sig.shape)
(7711,)

y.shape
(5595,)

Answer 1

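Take another look at the docstring of librosa.core.load. Its first three sentences say:

"Load an audio file as a floating point time series. Audio will be automatically resampled to the given rate (default sr=22050). To preserve the native sampling rate of the file, use sr=None."

So librosa.core.load is converting the data to floating point and (by default) resampling it to 22050 samples per second. You used sr=None, so I don't know why the arrays have different lengths.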
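To make the two effects concrete, here is a minimal sketch using only NumPy. It rests on two assumptions that are not stated in the question: that 16-bit PCM samples are mapped to floats by dividing by 32768, and that the file's native rate is 16000 Hz (a guess, chosen because it matches the reported lengths). The names native_sr, target_sr and expected_len are illustrative only.

```python
import numpy as np

# First ten int16 samples reported by scipy.io.wavfile.read in the question.
y = np.array([94, -10, -217, -564, -627, -582, -527, -520, -440, -349],
             dtype=np.int16)

# Assumption: 16-bit PCM is scaled to float by dividing by 2**15 = 32768,
# so the result lies in [-1.0, 1.0).
y_float = y.astype(np.float32) / 32768.0
print(y_float.dtype, y_float[:3])

# Array lengths reported in the question.
n_scipy, n_librosa = 5595, 7711

# Hypothetical rates: 16000 Hz is a guess for the file's native rate;
# 22050 Hz is the librosa default named in the docstring.
native_sr, target_sr = 16000, 22050

# Length a resampler would produce if it rounds the sample count up.
expected_len = int(np.ceil(n_scipy * target_sr / native_sr))
print(expected_len, n_librosa)
```

Under these assumptions the expected length comes out equal to the 7711 samples shown for sig, which would suggest that the array whose shape was printed came from a call with the default sr rather than sr=None. It is worth re-running both loads in a fresh session and comparing the returned sample rates before drawing conclusions.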
