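I am working on speech emotion recognition. For each audio file I do the following:
- Compute MFCCs with their delta and delta-delta features for every frame (13 + 13 + 13 = 39).
- Compute spectral subband centroids, 26 per frame.
- For each coefficient, compute the mean, variance, maximum, minimum, skewness, kurtosis, and interquartile range over all frames of the file.
The resulting feature vector has dimension (13 + 13 + 13 + 26) * 7 = 455. I split the dataset into a training set (0.8) and a test set (0.2).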
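For reference, the per-file aggregation described above could be sketched roughly as follows. This is only an illustration, not my actual extraction code: the helper name `summarize_frames` is hypothetical, the frame matrix is random placeholder data, and skewness and kurtosis are computed as plain population moments (excess kurtosis), which may differ slightly from what a library routine returns.

```python
import numpy as np

def summarize_frames(frames):
    """Collapse a (n_frames, n_coeffs) matrix into 7 statistics per coefficient."""
    frames = np.asarray(frames, dtype=float)
    mean = frames.mean(axis=0)
    var = frames.var(axis=0)
    std = frames.std(axis=0)
    centered = frames - mean
    # Avoid division by zero for coefficients that are constant over all frames.
    safe_std = np.where(std > 0, std, 1.0)
    skew = (centered ** 3).mean(axis=0) / safe_std ** 3
    kurt = (centered ** 4).mean(axis=0) / safe_std ** 4 - 3.0
    q75, q25 = np.percentile(frames, [75, 25], axis=0)
    iqr = q75 - q25
    return np.concatenate(
        [mean, var, frames.max(axis=0), frames.min(axis=0), skew, kurt, iqr]
    )

# Placeholder input: 200 frames x (39 MFCC-related + 26 subband centroid) values.
rng = np.random.default_rng(0)
fake_frames = rng.normal(size=(200, 65))
features = summarize_frames(fake_frames)
print(features.shape)  # (455,)
```

Stacking one such vector per audio file gives a 2-D array of shape (n_files, 455).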
import keras
from keras.models import Sequential
from keras.layers import Conv1D, Activation, Dropout, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(128, 5, padding='same', batch_input_shape=(None, 40, 1)))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(MaxPooling1D(pool_size=8))
model.add(Conv1D(128, 5, padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(8))
model.add(Activation('softmax'))
opt = keras.optimizers.rmsprop(lr=0.00005, rho=0.9, epsilon=None, decay=0.0)
model.summary()
This is my code:
x_traincnn = np.expand_dims(X_train, axis=2)
x_testcnn = np.expand_dims(X_test, axis=2)
The output is ((960, 455, 1), (240, 455, 1))
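To make the shapes concrete, here is a small check that reproduces them. It assumes `X_train` and `X_test` are 2-D arrays of shape (960, 455) and (240, 455); the zero arrays are placeholders, not the real features.

```python
import numpy as np

# Placeholder arrays with the shapes from the question (zeros, not real data).
X_train = np.zeros((960, 455))
X_test = np.zeros((240, 455))

# Add a trailing channel axis so each sample is (steps, channels).
x_traincnn = np.expand_dims(X_train, axis=2)
x_testcnn = np.expand_dims(X_test, axis=2)
print(x_traincnn.shape, x_testcnn.shape)  # (960, 455, 1) (240, 455, 1)

# Shape of a single sample, i.e. what the first layer has to accept.
per_sample_shape = x_traincnn.shape[1:]
print(per_sample_shape)  # (455, 1)
```

So each sample has 455 steps and 1 channel, which is not the 40 steps declared in the model.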
The error is:
ValueError                                Traceback (most recent call last)
<ipython-input-58-c6d1c71bdd25> in <module>()
----> 1 model.add(Conv1D(128, 5,padding='same', batch_input_shape=(40, 1) ))
2 model.add(Activation('relu'))
3 model.add(Dropout(0.1))
4 model.add(MaxPooling1D(pool_size=(8)))
5 model.add(Conv1D(128, 5,padding='same',))
2 frames
/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py in assert_input_compatibility(self, inputs)
303 self.name + ': expected ndim=' +
304 str(spec.ndim) + ', found ndim=' +
--> 305 str(K.ndim(x)))
306 if spec.max_ndim is not None:
307 ndim = K.ndim(x)
ValueError: Input 0 is incompatible with layer conv1d_34: expected ndim=3, found ndim=2
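One possible reading of this traceback: the failing line passes `batch_input_shape=(40, 1)`, and because `batch_input_shape` includes the batch axis, Keras would take that as a batch of 40 samples with 1 feature each, which is a 2-D input. Conv1D needs 3-D input of the form (batch, steps, channels). The following is a sketch of a model that declares the per-sample shape as (455, 1) to match the arrays above. It assumes `tensorflow.keras` rather than the standalone `keras` package shown in the traceback, and the constant `N_FEATURES` is introduced here only for illustration.

```python
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import (
    Activation, Conv1D, Dense, Dropout, Flatten, MaxPooling1D,
)

N_FEATURES = 455  # length of each feature vector (assumed from the shapes above)

model = Sequential()
# Per-sample shape is (steps, channels); the batch axis is left implicit.
model.add(Input(shape=(N_FEATURES, 1)))
model.add(Conv1D(128, 5, padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(MaxPooling1D(pool_size=8))
model.add(Conv1D(128, 5, padding='same'))
model.add(Activation('relu'))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(8))
model.add(Activation('softmax'))
model.summary()
```

With this declaration, arrays shaped (960, 455, 1) and (240, 455, 1) should be accepted by the first layer. The optimizer and training settings are left out of the sketch.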