I am running a regression-type convolutional neural network. The network takes 55x1756 images as input and outputs images of size 11x1756. The last layer of my architecture (shown below) is therefore a dense layer whose size is the product of the output dimensions (11 * 1756 = 19316).
As shown below, I am using the 'tanh' activation function and 'adam' as the optimizer. I have been training the network for a while, but the results are almost always the same: the loss stays flat, and so does the root mean squared error, except that the validation loss sits below the (already unsatisfactory) training loss. Attached below are the training output and the model summary.
Do you have any suggestions on how to improve this? Thanks in advance!
import numpy as np

def generator(data_arr, batch_size = 10):
    num = len(data_arr)
    num = int(num / batch_size)
    # Loop forever so the generator never terminates
    while True:
        for offset in range(0, num):
            batch_samples = data_arr[offset * batch_size:(offset + 1) * batch_size]
            samples = []
            labels = []
            for batch_sample in batch_samples:
                samples.append(batch_sample[0])
                # Flatten each 11x1756 label image into a 1D vector
                labels.append(np.array(batch_sample[1].flatten()).transpose())
            X_ = np.array(samples)
            Y_ = np.array(labels)
            # Add a channel axis so the input shape becomes (batch, 55, 1756, 1)
            X_ = X_[:, :, :, np.newaxis]
            yield (X_, Y_)
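For reference, here is a quick way to sanity-check what the generator yields. This is a hypothetical smoke test, assuming training_data is a list of (55x1756 image, 11x1756 label) pairs as described above:

# hypothetical smoke test, not part of the training script
gen = generator(training_data, batch_size = 10)
X_batch, Y_batch = next(gen)
print(X_batch.shape)  # expected: (10, 55, 1756, 1)
print(Y_batch.shape)  # expected: (10, 19316), since 11 * 1756 = 19316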
# compile and train the model using the generator function
train_generator = generator(training_data, batch_size = 10)
validation_generator = generator(val_data, batch_size = 10)

from keras.models import Sequential
from keras.layers import (Conv2D, Activation, MaxPooling2D,
                          BatchNormalization, Flatten, Dense)

model = Sequential()
model.add(Conv2D(4, (2, 2), input_shape = (55, 1756, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(BatchNormalization())
model.add(Conv2D(8, (2, 2)))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(BatchNormalization())
model.add(Conv2D(16, (2, 2)))
model.add(Activation('tanh'))
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(19316))
model.add(Activation('softmax'))
from keras import backend

def nrmse(y_true, y_pred):
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true))) / 2

def rmse(y_true, y_pred):
    return backend.sqrt(backend.mean(backend.square(y_pred - y_true)))
model.compile(loss = 'mean_squared_error',
              optimizer = 'adam',
              metrics = [rmse, nrmse, 'mae'])

model.summary()
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 27, 878, 4)        20
_________________________________________________________________
activation_1 (Activation)    (None, 27, 878, 4)        0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 9, 292, 4)         0
_________________________________________________________________
batch_normalization_1 (Batch (None, 9, 292, 4)         16
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 8, 291, 8)         136
_________________________________________________________________
activation_2 (Activation)    (None, 8, 291, 8)         0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 2, 97, 8)          0
_________________________________________________________________
batch_normalization_2 (Batch (None, 2, 97, 8)          32
_________________________________________________________________
flatten_1 (Flatten)          (None, 1552)              0
_________________________________________________________________
dense_1 (Dense)              (None, 19316)             29997748
_________________________________________________________________
activation_3 (Activation)    (None, 19316)             0
=================================================================
Total params: 29,997,952
Trainable params: 29,997,928
Non-trainable params: 24
_________________________________________________________________
Epoch 1/6
6660/6660 [==============================] - 425s 64ms/step - loss: 0.0135 - rmse: 0.0986 - nrmse: 0.0577 - mean_absolute_error: 0.0333 - val_loss: 0.0133 - val_rmse: 0.0971 - val_nrmse: 0.0572 - val_mean_absolute_error: 0.0327
Epoch 2/6
6660/6660 [==============================] - 422s 63ms/step - loss: 0.0135 - rmse: 0.0986 - nrmse: 0.0577 - mean_absolute_error: 0.0332 - val_loss: 0.0133 - val_rmse: 0.0971 - val_nrmse: 0.0572 - val_mean_absolute_error: 0.0327
Epoch 3/6
6660/6660 [==============================] - 422s 63ms/step - loss: 0.0135 - rmse: 0.0986 - nrmse: 0.0577 - mean_absolute_error: 0.0332 - val_loss: 0.0133 - val_rmse: 0.0971 - val_nrmse: 0.0572 - val_mean_absolute_error: 0.0327
Epoch 4/6
6660/6660 [==============================] - 422s 63ms/step - loss: 0.0135 - rmse: 0.0986 - nrmse: 0.0577 - mean_absolute_error: 0.0332 - val_loss: 0.0133 - val_rmse: 0.0971 - val_nrmse: 0.0572 - val_mean_absolute_error: 0.0327
Epoch 5/6
6660/6660 [==============================] - 422s 63ms/step - loss: 0.0135 - rmse: 0.0986 - nrmse: 0.0577 - mean_absolute_error: 0.0332 - val_loss: 0.0133 - val_rmse: 0.0971 - val_nrmse: 0.0572 - val_mean_absolute_error: 0.0327
Epoch 6/6
6660/6660 [==============================] - 421s 63ms/step - loss: 0.0135 - rmse: 0.0986 - nrmse: 0.0577 - mean_absolute_error: 0.0332 - val_loss: 0.0133 - val_rmse: 0.0971 - val_nrmse: 0.0572 - val_mean_absolute_error: 0.0327
Answer 0 (score: 1)
If you use activation functions other than ReLU, you may be running into the vanishing gradient problem: tanh saturates at large input magnitudes, so its gradient shrinks toward zero, while ReLU does not saturate for positive inputs. Try changing those activations to ReLU and see whether training improves; a sketch of that change follows.
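A minimal sketch of that suggestion, assuming the rest of the model from the question stays unchanged; only the hidden-layer 'tanh' activations are swapped, and the output layer is kept exactly as posted:

from keras.models import Sequential
from keras.layers import (Conv2D, Activation, MaxPooling2D,
                          BatchNormalization, Flatten, Dense)

model = Sequential()
model.add(Conv2D(4, (2, 2), input_shape = (55, 1756, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(BatchNormalization())
model.add(Conv2D(8, (2, 2)))
model.add(Activation('relu'))  # was 'tanh'
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(BatchNormalization())
model.add(Conv2D(16, (2, 2)))
model.add(Activation('relu'))  # was 'tanh'
model.add(MaxPooling2D(pool_size = (3, 3)))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(19316))
model.add(Activation('softmax'))  # output layer kept as in the question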