I'm new to Python and librosa. I'm trying to use this method for a speech recognizer: acoustic front end
My code:
import librosa
import librosa.display
import numpy as np
y, sr = librosa.load('test.wav', sr = None)
normalizedy = librosa.util.normalize(y)
stft = librosa.core.stft(normalizedy, n_fft = 256, hop_length=16)
mel = librosa.feature.melspectrogram(S=stft, n_mels=32)
melnormalized = librosa.util.normalize(mel)
mellog = np.log(melnormalized) - np.log(10**-5)
The problem is that when I apply librosa.util.normalize to the variable mel, I expect the values to lie between -1 and 1, but they don't. What am I missing?
Answer 0 (score: 1)
If you want the output to be log-scaled and normalized to between -1 and 1, you should log-scale first and then normalize:
import librosa
import librosa.display
import numpy as np
y, sr = librosa.load('test.wav', sr = None)
normalizedy = librosa.util.normalize(y)
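stft = librosa.stft(normalizedy, n_fft=256, hop_length=16)
# stft is complex; pass the real-valued power spectrogram, and pass sr because the file was loaded at its native rate
mel = librosa.feature.melspectrogram(S=np.abs(stft)**2, sr=sr, n_mels=32)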
mellog = np.log(mel + 1e-9)
melnormalized = librosa.util.normalize(mellog)
# use melnormalized
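
To see why the order matters, here is a minimal, dependency-free sketch. It assumes the default behaviour of librosa.util.normalize (norm=np.inf, axis=0), which divides each column, i.e. each frame of a mel spectrogram, by its largest absolute value. The helper peak_normalize and the values in frame are hypothetical stand-ins for illustration and are not part of librosa.

```python
import math

def peak_normalize(column):
    """Divide every value by the largest absolute value in the column."""
    peak = max(abs(v) for v in column)
    if peak == 0:
        return list(column)  # all zeros: nothing to scale
    return [v / peak for v in column]

# Hypothetical mel-band powers for a single frame (non-negative).
frame = [2.5, 0.04, 0.0003, 1.2]

# Order used in the question: normalize, then take the log with an offset.
normalized_first = peak_normalize(frame)
log_after = [math.log(v) - math.log(1e-5) for v in normalized_first]

# Order used in the answer: take the log (with a small floor), then normalize.
log_first = [math.log(v + 1e-9) for v in frame]
normalized_after = peak_normalize(log_first)

print(max(abs(v) for v in log_after))         # larger than 1
print(max(abs(v) for v in normalized_after))  # 1.0
```

Normalizing only bounds the values it is applied to. Any later transform, such as the log and the constant offset in the question, moves them out of [-1, 1] again, so the normalization has to be the last step.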