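I am trying to train a neural network on text data.
I have already vectorized the data with Gensim, and I want to predict the type(s) of each text at the end.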
Example training data:

<text>           type1  type2  type3  type4  type5
"hello world"    false  false  true   false  true
...
"goodbye world"  true   true   true   false  true
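from gensim.models import KeyedVectors

# Load the saved vectors, then add a trailing channel axis for Conv1D
X_train = KeyedVectors.load(train_vectors, mmap='r')
X_train = X_train.vectors.reshape(X_train.vectors.shape[0], 100, 1)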
My vector dataset has shape (1019471, 100).
model = Sequential()
model.add(Input(shape=(100,1)))
model.add(Conv1D(32, 3, input_shape=(1, 100), activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train,
          epochs=4,
          validation_data=(X_validate, y_validate),
          verbose=args["verbose"])
This raises the following error:
' while using as loss `' + loss_name + '`. '
ValueError: A target array with shape (1019471, 5) was passed for an output of shape (None, 98, 5) while using as loss `categorical_crossentropy`. This loss expects targets to have the same shape as the output.
I can't work out where the shape (None, 98, 5) comes from. How can I get a better model?
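One possible reading of the 98, sketched in plain Python rather than Keras: a Conv1D with kernel size 3, stride 1 and the default 'valid' padding shortens a length-100 axis to 100 - 3 + 1 = 98, and a Dense layer applied to a 3D input only changes the last axis, so the 98 survives to the output. The helper names below (`conv1d_output_length`, `trace_shapes`) are made up for illustration and are not part of any library; the sketch assumes the default stride and padding, since the model code above does not set them.

```python
def conv1d_output_length(input_length, kernel_size, stride=1, padding="valid"):
    """Length of the steps axis after a 1D convolution (no dilation).

    Hypothetical helper for illustration, not a Keras function.
    """
    if padding == "same":
        # 'same' padding keeps ceil(input_length / stride) steps
        return -(-input_length // stride)
    if padding == "valid":
        # 'valid' padding only keeps positions where the kernel fits entirely
        return (input_length - kernel_size) // stride + 1
    raise ValueError("padding must be 'valid' or 'same'")


def trace_shapes(steps=100, channels=1, filters=32, kernel_size=3,
                 dense_units=(16, 5)):
    """Follow the output shape through Input -> Conv1D -> Dense -> Dense."""
    shapes = [(None, steps, channels)]            # Input(shape=(100, 1))
    conv_steps = conv1d_output_length(steps, kernel_size)
    shapes.append((None, conv_steps, filters))    # Conv1D(32, 3)
    for units in dense_units:
        # Dense on a 3D tensor replaces only the last axis
        shapes.append(shapes[-1][:-1] + (units,))
    return shapes


shapes = trace_shapes()
for shape in shapes:
    print(shape)
```

If this reading is right, collapsing the steps axis before the Dense layers (for instance with a Keras Flatten or GlobalMaxPooling1D layer) would give an output of shape (None, 5), which matches the targets. Separately, the sample rows carry several true labels each, which softmax with categorical_crossentropy does not model; sigmoid outputs with binary_crossentropy are the usual choice for multi-label targets.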