Question

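I am trying to build a simple neural network model for next-word prediction.

I have a list of phrases, and I split each phrase into overlapping 3-word sequences. E.g.:

Input: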

1. This is phrase number one.
2. This is phrase number two.

Result:

1. this is phrase
2. is phrase number
3. phrase number one
...
n. phrase number two

These are my X values. For the y values, I take the 4th word, i.e. the word that follows each 3-word sequence. E.g.:

X[1]: this is phrase
y[1]: number
...
X[n]: is phrase number
y[n]: two
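The windowing described above can be sketched as follows. This is only an illustration, not the asker's actual preprocessing: the function name `make_windows`, the lowercasing, and the punctuation-stripping regex are assumptions. Note that a window is only emitted when a following word exists, so the last three words of a phrase never become an X entry on their own.

```python
import re

def make_windows(phrases, context_size=3):
    """Return (contexts, targets): each context is a list of `context_size`
    consecutive words, and the target is the word that follows it."""
    contexts, targets = [], []
    for phrase in phrases:
        # Assumed tokenization: lowercase, keep letters/digits/apostrophes.
        words = re.findall(r"[a-z0-9']+", phrase.lower())
        # Stop early enough that a next word always exists.
        for start in range(len(words) - context_size):
            contexts.append(words[start:start + context_size])
            targets.append(words[start + context_size])
    return contexts, targets

phrases = ["This is phrase number one.", "This is phrase number two."]
contexts, targets = make_windows(phrases)
for context, target in zip(contexts, targets):
    print(" ".join(context), "->", target)
```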

Finally, X and y are encoded as vectors of 0s and 1s. For example:

X[0]: 0 0 0 ... 1 0 1 0 ... 1 - at most three 1s
y[0]: 0 0 0 ... 1 0 0 0 ... 0 - at most one 1

Each 1 in a vector marks the position of a word in my vocabulary.
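A minimal sketch of this encoding, assuming a sorted vocabulary and NumPy arrays (the helper names `encode_context` and `encode_target` are hypothetical, and the sample data is made up for illustration):

```python
import numpy as np

# Illustrative data; in the question these come from the phrase list.
contexts = [["this", "is", "phrase"], ["is", "phrase", "number"]]
targets = ["number", "one"]

# Assumed vocabulary construction: every distinct word, sorted.
vocabulary = sorted({w for c in contexts for w in c} | set(targets))
word_to_index = {word: i for i, word in enumerate(vocabulary)}

def encode_context(words):
    """Multi-hot vector: a 1 at the index of every word in the context."""
    vec = np.zeros(len(vocabulary), dtype=np.float32)
    for word in words:
        vec[word_to_index[word]] = 1.0
    return vec

def encode_target(word):
    """One-hot vector: a single 1 at the index of the target word."""
    vec = np.zeros(len(vocabulary), dtype=np.float32)
    vec[word_to_index[word]] = 1.0
    return vec

X = np.array([encode_context(c) for c in contexts])
y = np.array([encode_target(t) for t in targets])
print(vocabulary)
print(X[0], y[0])
```

One property of this representation worth keeping in mind: the X vector records which words occur but not their order, so "this is phrase" and "phrase is this" map to the same vector.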

That is my data. I am not sure whether this is the best way to do the embedding. Is there a better way?

Next, my model looks like this:

    from keras.models import Sequential
    from keras.layers import Dense

    # Input and output sizes both equal the vocabulary size
    model = Sequential()
    model.add(Dense(12, input_dim=len(X_train[0]), kernel_initializer='uniform', activation='relu'))
    model.add(Dense(8, input_dim=len(X_train[0]), kernel_initializer='uniform', activation='relu'))
    model.add(Dense(len(X_train[0]), input_dim=len(X_train[0]), kernel_initializer='uniform', activation='sigmoid'))

    # Compile model
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

    # Fit the model
    model.fit(X_train, y_train, epochs=10, batch_size=10, verbose=1)

But when I try to make a prediction, I get the same next word as the result every time.

The code that recovers the next-word prediction:

    y_pred = model.predict(X_test)
    for i in range(len(X_test)):
        current_sentence = []
        pred_word = []
        # Recover the input words from the positions set to 1
        for j in range(len(X_test[i])):
            if X_test[i][j] == 1:
                current_sentence.append(vocabulary[j])

        # The predicted word is the vocabulary entry with the highest score
        pred_word.append(vocabulary[list(y_pred[i]).index(max(y_pred[i]))])

        print("---------------\n")
        print("Current sentence: ", current_sentence)
        print("Next word prediction: ", pred_word)
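The decoding loop above can also be written with NumPy, which makes it easy to check how many distinct words the model actually predicts. This is a sketch under assumptions: `vocabulary`, `X_test`, and `y_pred` below are small made-up stand-ins rather than real model output, and `decode` is a hypothetical helper.

```python
import numpy as np

# Made-up stand-ins for the real vocabulary, test inputs and model scores.
vocabulary = ["is", "number", "one", "phrase", "this"]
X_test = np.array([[1, 0, 0, 1, 1],
                   [1, 1, 0, 1, 0]])
y_pred = np.array([[0.1, 0.6, 0.1, 0.1, 0.1],
                   [0.2, 0.1, 0.5, 0.1, 0.1]])

def decode(x_row, y_row):
    """Return (input words, predicted next word) for one test example."""
    current_sentence = [vocabulary[j] for j in np.flatnonzero(x_row == 1)]
    next_word = vocabulary[int(np.argmax(y_row))]
    return current_sentence, next_word

decoded = [decode(x_row, y_row) for x_row, y_row in zip(X_test, y_pred)]
for current_sentence, next_word in decoded:
    print("Current sentence:", current_sentence, "Next word prediction:", next_word)

# Diagnostic: number of distinct predicted words across the test set.
# A value of 1 means every input is mapped to the same word.
distinct_predictions = len(np.unique(np.argmax(y_pred, axis=1)))
print("Distinct predicted words:", distinct_predictions)
```

Because the input vector does not keep word order, the recovered sentence comes back in vocabulary order rather than in the original order.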

Simple NN model for next-word prediction

0 answers: