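I built a simple LSTM model, but no matter how many epochs I train for, my validation accuracy stays at around 50%. Here is how it compares with the training accuracy: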
Epoch 15/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.9408 - accuracy: 0.7999 - val_loss: 3.5255 - val_accuracy: 0.5190
Epoch 16/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.8724 - accuracy: 0.8080 - val_loss: 3.6279 - val_accuracy: 0.5127
Epoch 17/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.8041 - accuracy: 0.8177 - val_loss: 3.6627 - val_accuracy: 0.5158
Epoch 18/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.7377 - accuracy: 0.8297 - val_loss: 3.7247 - val_accuracy: 0.5140
Epoch 19/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.6680 - accuracy: 0.8431 - val_loss: 3.8000 - val_accuracy: 0.5144
Epoch 20/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.6036 - accuracy: 0.8578 - val_loss: 3.9164 - val_accuracy: 0.5051
Epoch 21/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.5460 - accuracy: 0.8715 - val_loss: 3.9832 - val_accuracy: 0.5089
Epoch 22/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.4830 - accuracy: 0.8872 - val_loss: 4.0284 - val_accuracy: 0.5095
Epoch 23/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.4277 - accuracy: 0.9019 - val_loss: 4.1428 - val_accuracy: 0.5067
Epoch 24/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.3760 - accuracy: 0.9169 - val_loss: 4.1972 - val_accuracy: 0.5069
Epoch 25/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.3319 - accuracy: 0.9275 - val_loss: 4.2494 - val_accuracy: 0.5047
Epoch 26/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.2883 - accuracy: 0.9406 - val_loss: 4.3047 - val_accuracy: 0.5075
Epoch 27/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.2471 - accuracy: 0.9507 - val_loss: 4.3822 - val_accuracy: 0.5063
Epoch 28/50
2527/2527 [==============================] - 22s 9ms/step - loss: 0.2131 - accuracy: 0.9592 - val_loss: 4.4553 - val_accuracy: 0.5071
I think the model may be overfitting, but I also have doubts about the validation_split argument that Keras provides. Does it even shuffle the data?
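On the shuffling question: as far as the Keras documentation for `Model.fit` describes it, `validation_split` takes the last fraction of the samples, before any shuffling, so a file with some ordering in it can yield an unrepresentative validation set. A minimal sketch of shuffling and splitting by hand, using only NumPy; the helper name `shuffle_and_split` and the toy arrays are hypothetical stand-ins, not part of the code in this post:

```python
import numpy as np


def shuffle_and_split(arrays, val_fraction=0.1, seed=0):
    """Apply one shared permutation to every array, then hold out the tail."""
    n = len(arrays[0])
    assert all(len(a) == n for a in arrays), "arrays must share the first dimension"
    perm = np.random.default_rng(seed).permutation(n)
    split_at = int(n * (1 - val_fraction))
    train = [a[perm[:split_at]] for a in arrays]
    val = [a[perm[split_at:]] for a in arrays]
    return train, val


# Toy stand-ins for the encoder inputs, decoder inputs and decoder targets.
enc = np.arange(20).reshape(10, 2)
dec = np.arange(10)
tgt = np.arange(10) * 10

(enc_tr, dec_tr, tgt_tr), (enc_va, dec_va, tgt_va) = shuffle_and_split([enc, dec, tgt])
```

The held-out arrays could then be passed to `model.fit` through `validation_data=([enc_va, dec_va], tgt_va)` in place of `validation_split`, so that the validation set is a random sample rather than the tail of the file.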
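Anyway, here is my complete code from the start, including how I feed in the data, so you can tell me what to change, from the batch size to the size of the final layer. Please take a look and tell me how to optimize it so that my validation accuracy improves.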
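from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding
from tensorflow.keras.models import Model

BATCH_SIZE = 64
EPOCHS = 50
LSTM_NODES = 256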
NUM_SENTENCES = 3000
MAX_SENTENCE_LENGTH = 50
MAX_NUM_WORDS = 3000
EMBEDDING_SIZE = 100
input_sentences = []
output_sentences = []
output_sentences_inputs = []
count = 0
for line in open(r'/content/drive/My Drive/TEMPPP/123.txt', encoding="utf-8"):
    count += 1
    if count > NUM_SENTENCES:
        break
    # Skip lines that do not contain a tab-separated source/target pair.
    if '\t' not in line:
        continue
    input_sentence, output = line.rstrip().split('\t')
    # Decoder target ends with <eos>; decoder input starts with <sos>.
    output_sentence = output + ' <eos>'
    output_sentence_input = '<sos> ' + output
    input_sentences.append(input_sentence)
    output_sentences.append(output_sentence)
    output_sentences_inputs.append(output_sentence_input)
input_tokenizer = Tokenizer(num_words=MAX_NUM_WORDS)
input_tokenizer.fit_on_texts(input_sentences)
input_integer_seq = input_tokenizer.texts_to_sequences(input_sentences)
word2idx_inputs = input_tokenizer.word_index
max_input_len = max(len(sen) for sen in input_integer_seq)
output_tokenizer = Tokenizer(num_words=MAX_NUM_WORDS, filters='')
output_tokenizer.fit_on_texts(output_sentences + output_sentences_inputs)
output_integer_seq = output_tokenizer.texts_to_sequences(output_sentences)
output_input_integer_seq = output_tokenizer.texts_to_sequences(output_sentences_inputs)
word2idx_outputs = output_tokenizer.word_index
num_words_output = len(word2idx_outputs) + 1
max_out_len = max(len(sen) for sen in output_integer_seq)
encoder_input_sequences = pad_sequences(input_integer_seq, maxlen=max_input_len)
decoder_input_sequences = pad_sequences(output_input_integer_seq, maxlen=max_out_len, padding='post')
import numpy as np
read_dictionary = np.load('/content/drive/My Drive/TEMPPP/hinvec.npy', allow_pickle=True).item()
num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)
embedding_matrix = np.zeros((num_words, EMBEDDING_SIZE))
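for word, index in word2idx_inputs.items():
    # word_index holds every word seen, so skip indices beyond the matrix.
    if index >= num_words:
        continue
    embedding_vector = read_dictionary.get(word)
    if embedding_vector is not None:
        embedding_matrix[index] = embedding_vector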
embedding_layer = Embedding(num_words, EMBEDDING_SIZE, weights=[embedding_matrix], input_length=max_input_len)
decoder_targets_one_hot = np.zeros(
    (len(input_sentences), max_out_len, num_words_output),
    dtype='float32'
)
decoder_output_sequences = pad_sequences(output_integer_seq, maxlen=max_out_len, padding='post')
for i, d in enumerate(decoder_output_sequences):
    for t, word in enumerate(d):
        decoder_targets_one_hot[i, t, word] = 1
encoder_inputs_placeholder = Input(shape=(max_input_len,))
x = embedding_layer(encoder_inputs_placeholder)
encoder = LSTM(LSTM_NODES, return_state=True)
encoder_outputs, h, c = encoder(x)
encoder_states = [h, c]
decoder_inputs_placeholder = Input(shape=(max_out_len,))
decoder_embedding = Embedding(num_words_output, LSTM_NODES)
decoder_inputs_x = decoder_embedding(decoder_inputs_placeholder)
decoder_lstm = LSTM(LSTM_NODES, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs_x, initial_state=encoder_states)
decoder_dense = Dense(num_words_output, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)
import tensorflow as tf
starter_learning_rate = 0.1
end_learning_rate = 0.01
decay_steps = 2000
learning_rate_fn = tf.keras.optimizers.schedules.PolynomialDecay(
    starter_learning_rate,
    decay_steps,
    end_learning_rate,
    power=0.5)
opt = tf.keras.optimizers.Adam(learning_rate=learning_rate_fn, epsilon=1e-03, clipvalue=0.5)
model = Model(
    [encoder_inputs_placeholder, decoder_inputs_placeholder],
    decoder_outputs
)
model.compile(
    optimizer=opt,
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
history = model.fit(
    [encoder_input_sequences, decoder_input_sequences],
    decoder_targets_one_hot,
    batch_size=BATCH_SIZE,
    epochs=EPOCHS,
    validation_split=0.1,
)
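I tried adding a Dropout layer, but I could not work out how to add it between the LSTM layer and the Dense layer. I am also suspicious of validation_split. I tried splitting the dataset into train_test_set and valid_test_set myself, but couldn't get that to work, so I ended up sticking with validation_split. I am fairly sure this is a case of overfitting, but I don't know how to deal with it.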
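For the Dropout placement, a minimal sketch, assuming the same encoder-decoder layout as the code above but with small placeholder sizes: since the decoder LSTM returns a 3D sequence tensor, a `Dropout` layer can be applied to it directly before the `Dense` layer. The rate of 0.5 and all sizes here are illustrative assumptions, not tuned values:

```python
from tensorflow.keras.layers import Input, Embedding, LSTM, Dropout, Dense
from tensorflow.keras.models import Model

# Placeholder sizes for illustration only; not the values from the post.
MAX_INPUT_LEN, MAX_OUT_LEN = 10, 12
NUM_WORDS_IN, NUM_WORDS_OUT = 50, 60
EMBEDDING_SIZE, LSTM_NODES = 16, 32
DROPOUT_RATE = 0.5  # assumed starting point; needs tuning

# Encoder: only the final hidden and cell states are kept.
encoder_inputs = Input(shape=(MAX_INPUT_LEN,))
x = Embedding(NUM_WORDS_IN, EMBEDDING_SIZE)(encoder_inputs)
_, h, c = LSTM(LSTM_NODES, return_state=True)(x)

# Decoder: starts from the encoder states and returns the full sequence.
decoder_inputs = Input(shape=(MAX_OUT_LEN,))
y = Embedding(NUM_WORDS_OUT, LSTM_NODES)(decoder_inputs)
decoder_lstm = LSTM(LSTM_NODES, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(y, initial_state=[h, c])

# Dropout sits between the LSTM output and the softmax layer.
decoder_outputs = Dropout(DROPOUT_RATE)(decoder_outputs)
decoder_outputs = Dense(NUM_WORDS_OUT, activation='softmax')(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
```

The Keras `LSTM` layer also accepts `dropout` and `recurrent_dropout` arguments, which may be worth trying alongside or instead of a separate `Dropout` layer.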