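I am trying to test a trained CNN model with Keras, but when I run the code I get this error:
expected input to have 4 dimensions, but got array with shape (32, 549, 1).
(32, 549, 1) is the size of the log spectrograms I used to train and test the CNN, and everything worked fine apart from this last error.
I tried np.resize(-1, amp) and y = (-1, amp) to add a dimension to my array, but neither works and I really don't know what to do.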
# Imports assumed from the calls used below
import numpy as np
from scipy.io import wavfile
from scipy.signal import stft
from keras.models import load_model

DIR = 'C:/Users/ROBERTO VILCHEZ/Desktop/Redes/TRAIN/ayuda/ayuda_1.wav'
SAMPLE_RATE = 88200
model = load_model('C:/Users/ROBERTO VILCHEZ/Desktop/Redes/mi_modelo.h5')

def read_wav_file(x):
    _, wav = wavfile.read(x)
    # Normalize 16-bit samples to [-1, 1]
    wav = wav.astype(np.float32) / np.iinfo(np.int16).max
    return wav

def log_spectrogram(wav):
    freqs, times, spec = stft(wav, SAMPLE_RATE, nperseg=400, noverlap=240,
                              nfft=512, padded=False, boundary=None)
    # Log spectrogram
    amp = np.log(np.abs(spec) + 1e-10)
    return freqs, times, amp

threshold_freq = 5500
eps = 1e-10
x = DIR
wav = read_wav_file(x)
L = 88200

# Crop or pad the signal to exactly L samples
if len(wav) > L:
    i = np.random.randint(0, len(wav) - L)
    wav = wav[i:(i + L)]
elif len(wav) < L:
    rem_len = L - len(wav)
    silence_part = np.random.randint(-100, 100, 88200).astype(np.float32) / np.iinfo(np.int16).max
    j = np.random.randint(0, rem_len)
    silence_part_left = silence_part[0:j]
    silence_part_right = silence_part[j:rem_len]
    wav = np.concatenate([silence_part_left, wav, silence_part_right])

freqs, times, spec = stft(wav, L, nperseg=400, noverlap=240, nfft=512,
                          padded=False, boundary=None)

# Keep only the frequency bins at or below the threshold
if threshold_freq is not None:
    spec = spec[freqs <= threshold_freq, :]
    freqs = freqs[freqs <= threshold_freq]

amp = np.log(np.abs(spec) + eps)
# Add a channel axis: (32, 549) -> (32, 549, 1)
y = np.expand_dims(amp, axis=-1)
res = model.predict(y)

All the rest of the code works fine; only the last part, the call to model.predict, fails with the error: expected input to have 4 dimensions, but got array with shape (32, 549, 1).
Full error:
ValueError: Error when checking input: expected input to have 4 dimensions, but got array with shape (32, 549, 1)
Answer 0: (score: 1)
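Keras expects a leading batch axis on the input, so even if you only want to predict a single sample, you need to expand your test data to (batch_size, .., .., ..).
So if your y has shape (32, 549, 1), it is a simple operation: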
y = np.expand_dims(y, axis=0) # y shape = (1, 32, 549, 1)
Then run your prediction.
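
To check the shapes without loading the model, here is a minimal sketch using only NumPy. It assumes the log spectrogram is the 2-D (32, 549) array from the question; the zero-filled `amp` is just a stand-in for the real data, and `batch_alt` is an illustrative name.

```python
import numpy as np

# Stand-in for the question's log spectrogram: 32 frequency bins x 549 frames.
amp = np.zeros((32, 549), dtype=np.float32)

# Channel axis, as used during training: (32, 549) -> (32, 549, 1).
y = np.expand_dims(amp, axis=-1)

# Batch axis for a single sample: (32, 549, 1) -> (1, 32, 549, 1).
batch = np.expand_dims(y, axis=0)
print(batch.shape)

# Equivalent one-step form using np.newaxis indexing.
batch_alt = amp[np.newaxis, :, :, np.newaxis]

# With the real data you would then call (model is the loaded Keras model):
# res = model.predict(batch)
```

For a model with a single output, the result should then have one row per sample in the batch, so res[0] would be the prediction for this one clip.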