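I'm working on dialect text classification: a tweet is the input, and the model should predict which of my 5 classes the tweet belongs to.
First, I have a word2vec model that was trained on unlabeled data: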
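import gensim
# Train word2vec on the unlabeled tweets (`size` is the gensim 3.x name; gensim 4+ calls it `vector_size`)
model = gensim.models.Word2Vec(documents, size=150, window=10, min_count=2, workers=10)
model.train(documents, total_examples=len(documents), epochs=10)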
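One common way to feed these vectors to a dense network is to average the word vectors of each tweet into a single fixed-length feature vector. Below is a minimal sketch of that idea, not necessarily the best representation for dialect data. The helper names `tweet_to_vector` and `tweets_to_matrix` and the `toy_vectors` dictionary are hypothetical and only there for illustration; with the gensim model above, `model.wv` could presumably be passed as `word_vectors` with `dim=150`, since it supports both `token in model.wv` and `model.wv[token]`.

```python
import numpy as np

def tweet_to_vector(tokens, word_vectors, dim):
    """Average the vectors of the in-vocabulary tokens of one tweet."""
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        # No known token: fall back to an all-zero vector.
        return np.zeros(dim, dtype=np.float32)
    return np.mean(known, axis=0).astype(np.float32)

def tweets_to_matrix(tokenized_tweets, word_vectors, dim):
    """Stack one averaged vector per tweet into a (n_tweets, dim) matrix."""
    return np.vstack([tweet_to_vector(t, word_vectors, dim) for t in tokenized_tweets])

# Toy stand-in for trained word vectors (assumption: a real run would use model.wv).
toy_vectors = {
    "hello": np.array([1.0, 0.0, 2.0]),
    "world": np.array([3.0, 2.0, 0.0]),
}
features = tweets_to_matrix([["hello", "world"], ["unknown"]], toy_vectors, dim=3)
print(features.shape)
```

The resulting matrix has one row per tweet and one column per embedding dimension, so its second dimension is what a dense input layer would take as `input_size`.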
I have the following code for the neural network:
from keras.preprocessing import text, sequence
from keras import layers, models, optimizers
from sklearn import metrics  # provides accuracy_score, used below
def create_model_architecture(input_size):
    # create input layer
    input_layer = layers.Input((input_size, ), sparse=True)
    # create hidden layer
    hidden_layer = layers.Dense(100, activation="relu")(input_layer)
    # create output layer
    output_layer = layers.Dense(4, activation="sigmoid")(hidden_layer)
    classifier = models.Model(inputs=input_layer, outputs=output_layer)
    classifier.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy', metrics=['accuracy'])
    return classifier
classifier = create_model_architecture(X.shape[1])
# fit the training dataset on the classifier
classifier.fit(train_X, train_y,epochs=1)
# predict the labels on validation dataset
predictions = classifier.predict(X)
predictions = predictions.argmax(axis=-1)
print(predictions)
print(metrics.accuracy_score(predictions, train_y))
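I want to classify into 5 classes, and I have already one-hot encoded the classes into train_y. But I don't know how to plug word2vec into the neural network as the input layer (if that is even what I should do) and then have it train on the 5 classes.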
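For the 5-class part, the usual setup for mutually exclusive classes with one-hot labels is an output layer with 5 units, a softmax activation, and `categorical_crossentropy` as the loss. Here is a rough sketch under some assumptions: each tweet has already been turned into a dense fixed-length vector (for example, the average of its word2vec vectors, which would be 150-dimensional here), and the random `demo_X` / `demo_y` arrays are placeholders I made up so the snippet runs on its own. They are not real data and the accuracy on them means nothing.

```python
import numpy as np
from keras import layers, models, optimizers

NUM_CLASSES = 5      # number of dialect classes from the question
EMBEDDING_DIM = 150  # matches size=150 of the word2vec model

def create_model_architecture(input_size, num_classes=NUM_CLASSES):
    # Dense (non-sparse) input: one fixed-length vector per tweet.
    input_layer = layers.Input((input_size,))
    hidden_layer = layers.Dense(100, activation="relu")(input_layer)
    # One unit per class; softmax yields a probability distribution over classes.
    output_layer = layers.Dense(num_classes, activation="softmax")(hidden_layer)
    classifier = models.Model(inputs=input_layer, outputs=output_layer)
    classifier.compile(optimizer=optimizers.Adam(),
                       loss="categorical_crossentropy",
                       metrics=["accuracy"])
    return classifier

# Placeholder data (assumption): random features and random one-hot labels.
rng = np.random.default_rng(0)
demo_X = rng.normal(size=(20, EMBEDDING_DIM)).astype("float32")
demo_labels = rng.integers(0, NUM_CLASSES, size=20)
demo_y = np.eye(NUM_CLASSES, dtype="float32")[demo_labels]

classifier = create_model_architecture(EMBEDDING_DIM)
classifier.fit(demo_X, demo_y, epochs=1, verbose=0)

# argmax turns both the predicted probabilities and the one-hot labels
# back into class indices, so the two can be compared directly.
probabilities = classifier.predict(demo_X, verbose=0)
predicted = probabilities.argmax(axis=-1)
accuracy = float((predicted == demo_y.argmax(axis=-1)).mean())
print(probabilities.shape, accuracy)
```

Note that the labels are converted back with `argmax` before computing accuracy; comparing class indices against a one-hot matrix directly would not line up.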