Why are my loss and accuracy plots somewhat unstable?

Date: 2020-10-17 22:15:11

Tags: python tensorflow keras

I built a Bi-LSTM model that tries to predict a category for a given word. For example, the word "smile" should be mapped to "friendly".

However, after training the model on 100 samples per class across 10 classes (1,000 samples in total), both the accuracy and the loss fluctuate continuously when plotted. Why does this happen? Increasing the number of samples leads to underfitting.

Model

import tensorflow as tf

def build_model(vocab_size, rnn_units=64, embedding_dim=64, input_length=30):
    print('\nbuilding the model...\n')

    model = tf.keras.Sequential([
        # map token ids to dense vectors (+1 to leave room for the padding index)
        tf.keras.layers.Embedding(input_dim=(vocab_size + 1), output_dim=embedding_dim, input_length=input_length),
        # stacked bidirectional LSTMs; return_sequences=True keeps per-timestep outputs for pooling
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(rnn_units, return_sequences=True, dropout=0.2)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(rnn_units, return_sequences=True, dropout=0.2)),
        # collapse the time dimension by taking the max over timesteps
        tf.keras.layers.GlobalMaxPool1D(),
        tf.keras.layers.Dropout(0.1),
        tf.keras.layers.Dense(64, activation='tanh', kernel_regularizer=tf.keras.regularizers.L2(l2=0.01)),
        
        # softmax output layer
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # optimizer & loss
    opt = 'RMSprop' #tf.optimizers.Adam(learning_rate=1e-4)
    loss = 'categorical_crossentropy'

    # Metrics
    metrics = ['accuracy', 'AUC','Precision', 'Recall']

    # compile model
    model.compile(optimizer=opt, 
                  loss=loss,
                  metrics=metrics)
    
    model.summary()

    return model
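For reference, a minimal usage sketch; the vocabulary size of 5000 is an assumed value for illustration, not taken from the question:

# hypothetical call: vocab_size=5000 is an assumption
model = build_model(vocab_size=5000, rnn_units=64, embedding_dim=64, input_length=30)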

Training

def train(model, x_train, y_train, x_validation, y_validation,
          epochs, batch_size=32, patience=5, 
          verbose=2, monitor_es='accuracy', mode_es='auto', restore=True,
          monitor_mc='val_accuracy', mode_mc='max'):
    
    # stop training when the monitored metric (training accuracy here) stops improving
    early_stopping = tf.keras.callbacks.EarlyStopping(monitor=monitor_es,
                                                      verbose=1, mode=mode_es, restore_best_weights=restore,
                                                      min_delta=1e-3, patience=patience)
    
    # save the weights whenever the monitored validation metric improves
    model_checkpoint = tf.keras.callbacks.ModelCheckpoint('tfjsmode.h5', monitor=monitor_mc, mode=mode_mc,      
                                                          verbose=1, save_best_only=True)

    keras_callbacks = [early_stopping, model_checkpoint]

    # train model
    history = model.fit(x_train, y_train,
                        batch_size=batch_size, epochs=epochs, verbose=verbose,
                        validation_data=(x_validation, y_validation),
                        callbacks=keras_callbacks)
    return history
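A possible call, sketched with randomly generated stand-in data; the shapes (sequences of length 30, 10 one-hot classes, an 800/200 train/validation split) and the epoch count are assumptions for illustration only:

import numpy as np

# toy stand-in data: integer token ids in [0, 5000] and one-hot labels over 10 classes
x_train = np.random.randint(0, 5001, size=(800, 30))
y_train = tf.keras.utils.to_categorical(np.random.randint(0, 10, size=(800,)), num_classes=10)
x_val = np.random.randint(0, 5001, size=(200, 30))
y_val = tf.keras.utils.to_categorical(np.random.randint(0, 10, size=(200,)), num_classes=10)

history = train(model, x_train, y_train, x_val, y_val, epochs=50, batch_size=16)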

Accuracy and loss

[accuracy plot]

[loss plot]

Batch size

The batch size is currently set to 16. If I increase the batch size to 64, with 2,500 samples per class, the final plots show underfitting.

[accuracy and loss plots with batch size 64]

1 Answer:

Answer 0 (score: 2)

As pointed out in the comments, the smaller the batch size, the higher the variance of the per-batch averages, and therefore the more the loss fluctuates. I usually use a batch size of 80 because I have a fairly large amount of memory.

You are using the ModelCheckpoint callback and saving the model with the best validation accuracy. It is better to save the model with the lowest validation loss. You say that increasing the number of samples leads to underfitting; that seems odd, since more samples usually lead to better accuracy.
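With the train() function from the question, that change only requires passing different monitor arguments; a minimal sketch (the data variables and epochs=50 are assumed as above):

# checkpoint on the lowest validation loss instead of the highest validation accuracy
history = train(model, x_train, y_train, x_val, y_val,
                epochs=50, batch_size=16,
                monitor_mc='val_loss', mode_mc='min')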