Question

I don't fully understand how the model works.

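I have a text. I want to teach my LSTM network to predict the next word in a sentence from the words that precede it. That is: I start with the first word, word_0, and predict word_1. Then from word_0 and word_1 I predict word_2, then from word_0, word_1 and word_2 I predict word_3, and so on until the end of the sentence.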
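To make the intended setup concrete, here is a minimal sketch of the (prefix, next word) pairs I have in mind. The helper name `prefix_targets` and the example sentence are made up for illustration and are not part of my actual code.

```python
def prefix_targets(sentence):
    """Return (prefix, next_word) pairs for every position in a sentence."""
    # For position t, the input is words 0..t-1 and the target is word t.
    return [(sentence[:t], sentence[t]) for t in range(1, len(sentence))]


# Hypothetical example sentence, not taken from my data.
pairs = prefix_targets(["the", "cat", "sat", "down"])
for prefix, target in pairs:
    print(prefix, "->", target)
```

A sentence of n words gives n - 1 training pairs this way, one per predicted word.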

As preprocessing, I use Word2Vec to vectorize all the words; this is the input to the LSTM. As the loss I use 'sparse_categorical_crossentropy'.

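I train my model and afterwards check its results on the same data it was trained on. I expect the model to predict the same sentences as in the training data. But it does not! (After all epochs, acc -> 1 and the loss decreases.) What is wrong? Why is my expectation incorrect?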
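The check I describe can be sketched as a greedy word-by-word generation loop. This is only an illustration: `generate` and `toy_predict_next` are hypothetical names, and the toy predictor simply replays one memorized sentence in place of a trained model. With a real Keras model, `predict_next` would presumably wrap `model.predict` and take the arg-max over the vocabulary.

```python
def generate(predict_next, first_word, max_len):
    """Extend a sentence one word at a time, feeding the growing prefix back in."""
    words = [first_word]
    while len(words) < max_len:
        words.append(predict_next(words))
    return words


# Stand-in for a trained model (assumption): it has memorized one sentence.
memorized = ["the", "cat", "sat", "down"]


def toy_predict_next(prefix):
    # Return the word that follows a prefix of this length in the memorized sentence.
    return memorized[len(prefix)]


result = generate(toy_predict_next, "the", len(memorized))
print(result)
```

If the model had really memorized the training data, I would expect this kind of loop to reproduce each training sentence from its first word.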

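My understanding is that when I use 'sparse_categorical_crossentropy', each word is represented by an integer, so all of the data is sequences of integers, and the LSTM is trained on those numbers.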
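As far as I understand it, the "sparse" variant only changes how the target is encoded: it takes an integer class index instead of a one-hot vector and gives the same loss value. A small sketch in plain Python (the function names `sparse_ce` and `dense_ce` and the probabilities are illustrative, not Keras code):

```python
import math


def sparse_ce(target_index, probs):
    """Cross-entropy when the target is given as an integer class index."""
    return -math.log(probs[target_index])


def dense_ce(one_hot, probs):
    """Cross-entropy when the target is given as a one-hot vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)


# Made-up softmax output over a 3-word vocabulary; the true word has index 1.
probs = [0.1, 0.7, 0.2]
sparse = sparse_ce(1, probs)
dense = dense_ce([0, 1, 0], probs)
print(sparse, dense)
```

Both calls compute -log(0.7), so the integer index is just a compact way of naming the target word.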

The embeddings were pretrained with Word2Vec:

# Embedding matrix of the trained Word2Vec model: one row per vocabulary word
pretrained_weights = word_model.wv.syn0
vocab_size, emdedding_size = pretrained_weights.shape

This is the data preparation:

import numpy as np

# Inputs: all words of a sentence except the last; target: the last word
train_x = np.zeros([len(sentences), max_sentence_len], dtype=np.int32)
train_y = np.zeros([len(sentences)], dtype=np.int32)
for i, sentence in enumerate(sentences):
    for t, word in enumerate(sentence[:-1]):
        train_x[i, t] = word2idx(word)
    train_y[i] = word2idx(sentence[-1])

where:

def word2idx(word):
    return word_model.wv.vocab[word].index

These are the LSTM settings:

from keras import optimizers
from keras.layers import Activation, Dense, Embedding, LSTM
from keras.models import Sequential

model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=emdedding_size, weights=[pretrained_weights]))
model.add(LSTM(units=emdedding_size, return_sequences=True))
model.add(LSTM(units=emdedding_size))
model.add(Dense(units=vocab_size))
model.add(Activation('softmax'))
model.compile(optimizer=optimizers.Nadam(lr=0.002, beta_1=0.9, beta_2=0.999, epsilon=None, schedule_decay=0.004), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

UPD 1: As I understand it, the Embedding layer is not set up correctly: weights=[pretrained_weights] is incorrect. Keras has been updated, and the documentation now shows this in a new form. I have already tried changing it.

Maybe that is not it, I don't know. The model is still training (I only have a CPU, so this can take quite a while).
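For reference, this is what I expect the Embedding layer to do with the pretrained weights, sketched without Keras: each integer index selects one row of the weight matrix. The function `embed` and the toy weights are assumptions for illustration only, not the Keras implementation.

```python
def embed(indices, weights):
    """Map each integer word index to its row in the embedding matrix."""
    return [weights[i] for i in indices]


# Toy 3-word vocabulary with 2-dimensional vectors (made-up values).
toy_weights = [
    [0.0, 0.0],  # index 0
    [0.1, 0.2],  # index 1
    [0.3, 0.4],  # index 2
]

embedded = embed([2, 0, 1], toy_weights)
print(embedded)
```

So the integer sequences in train_x should reach the LSTM as sequences of Word2Vec vectors, provided the layer really starts from pretrained_weights.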

Keras, LSTM, sparse categorical cross-entropy
