When I train my model, the loss drops from 0.9 to 0.5 over 2500 epochs. Is this normal?
My model:
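# Imports assumed for standalone Keras 2.x; the original post does not show them.
import keras
from keras import regularizers
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

# vocab_size, emd_dim, emd_matrix, maxLen, l_h2i, modelname, batch_size and epochs are defined elsewhere.
model = Sequential()
# Frozen pretrained embedding layer.
model.add(Embedding(vocab_size, emd_dim, weights=[emd_matrix], input_length=maxLen, trainable=False))
# Three stacked LSTM layers; only the last one returns a single vector per sequence.
model.add(LSTM(256, return_sequences=True, activation="relu", kernel_regularizer=regularizers.l2(0.01), kernel_initializer=keras.initializers.glorot_normal(seed=None)))
model.add(LSTM(256, return_sequences=True, activation="relu", kernel_regularizer=regularizers.l2(0.01), kernel_initializer=keras.initializers.glorot_normal(seed=None)))
model.add(LSTM(256, return_sequences=False, activation="relu", kernel_regularizer=regularizers.l2(0.01), kernel_initializer=keras.initializers.glorot_normal(seed=None)))
model.add(Dense(l_h2i, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=['accuracy'])
filepath = "F:/checkpoints/"+modelname+"/lstm-{epoch:02d}-{loss:0.3f}-{acc:0.3f}-{val_loss:0.3f}-{val_acc:0.3f}.hdf5"
# Both callbacks monitor the training loss, not val_loss.
checkpoint = ModelCheckpoint(filepath, monitor="loss", verbose=1, save_best_only=True, mode='min')
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.5, patience=2, min_lr=0.000001)
model.summary()  # summary() prints the table itself and returns None
history = model.fit(X_train_indices, Y_train_oh, batch_size=batch_size,
                    epochs=epochs, validation_split=0.1, shuffle=True,
                    callbacks=[checkpoint, reduce_lr])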
Part of the output looks like this:
loss improved from 0.54275 to 0.54272
loss: 0.5427 - acc: 0.8524 - val_loss: 1.1198 - val_acc: 0.7610
loss improved from 0.54272 to 0.54268
loss: 0.5427 - acc: 0.8525 - val_loss: 1.1195 - val_acc: 0.7311
loss improved from 0.54268 to 0.54251
loss: 0.5425 - acc: 0.8519 - val_loss: 1.1218 - val_acc: 0.7420
loss improved from 0.54251 to 0.54249
loss: 0.5425 - acc: 0.8517 - val_loss: 1.1210 - val_acc: 0.7518
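Answer 0 (score: 0)
Consider updating the ReduceLROnPlateau parameters along the lines of the example in the TensorFlow documentation, which uses a stronger reduction (factor=0.2 rather than 0.5), a longer patience (5 epochs rather than 2), and monitors val_loss instead of the training loss: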
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
patience=5, min_lr=0.001)
model.fit(X_train, Y_train, callbacks=[reduce_lr])
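As a rough check on why the question's settings may stall training, one can count how many reductions it takes to hit min_lr. This is a minimal sketch, assuming the optimizer starts at 0.001 (Adam's default in Keras; the question does not set a rate explicitly). `reductions_to_floor` is a hypothetical helper written for illustration, not part of Keras.

```python
def reductions_to_floor(initial_lr, factor, min_lr):
    """Count how many times lr must be multiplied by `factor` to reach `min_lr`."""
    if not 0 < factor < 1:
        raise ValueError("factor must be strictly between 0 and 1")
    if min_lr <= 0:
        raise ValueError("min_lr must be positive for the count to be finite")
    lr, steps = initial_lr, 0
    while lr > min_lr:
        lr *= factor
        steps += 1
    return steps


# Settings from the question: factor=0.5, min_lr=1e-6 (starting rate is an assumption).
question_steps = reductions_to_floor(0.001, factor=0.5, min_lr=1e-6)
# Settings from the documentation example: factor=0.2, min_lr=0.001.
docs_steps = reductions_to_floor(0.001, factor=0.2, min_lr=0.001)
print(question_steps, docs_steps)
```

Under these assumptions the question's callback can reach its floor after ten reductions, which with patience=2 is about 20 non-improving epochs out of 2500. The documentation's min_lr=0.001 equals the assumed starting rate, so that value would also need lowering before any reduction could happen.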
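Arguments:
- monitor: quantity to be monitored.
- factor: factor by which the learning rate will be reduced. new_lr = lr * factor
- patience: number of epochs with no improvement after which the learning rate will be reduced.
- verbose: int. 0: quiet, 1: update messages.
- mode: one of {auto, min, max}. In min mode, lr will be reduced when the monitored quantity has stopped decreasing; in max mode, it will be reduced when the monitored quantity has stopped increasing; in auto mode, the direction is inferred automatically from the name of the monitored quantity.
- min_delta: threshold for measuring the new optimum, to focus only on significant changes.
- cooldown: number of epochs to wait before resuming normal operation after lr has been reduced.
- min_lr: lower bound on the learning rate.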