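I'm working on a sentiment analysis project in Python with Keras, using a CNN with word2vec as the embedding method. I want to detect positive, negative, and neutral tweets (in my corpus every negative tweet is labeled 0, positive = 1, and neutral = 2).
Assume that X_train and X_test contain the tweets, and that Y_train and Y_test contain the tweet labels.
I then set the labels this way: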
if labels[index] == 0:
    Y_train[i, :] = [1.0, 0.0]  # negative
elif labels[index] == 1:
    Y_train[i, :] = [0.0, 1.0]  # positive
else:
    Y_train[i, :] = [0.5, 0.5]  # neutral
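For reference, the same label mapping can be written without an explicit loop. This is only a minimal sketch: it assumes the labels are exactly the integers 0, 1, and 2 (anything else would raise an error or be mis-indexed, unlike the `else` branch above), and the names `TARGETS` and `encode_labels` are hypothetical, not part of my actual code.

```python
import numpy as np

# Lookup table: row k is the target vector for corpus label k.
# 0 = negative, 1 = positive, 2 = neutral (the encoding described above).
TARGETS = np.array([
    [1.0, 0.0],  # negative
    [0.0, 1.0],  # positive
    [0.5, 0.5],  # neutral
])


def encode_labels(labels):
    """Map integer labels (0, 1, 2) to two-column target vectors."""
    labels = np.asarray(labels, dtype=int)
    # Fancy indexing picks one row of TARGETS per label.
    return TARGETS[labels]


# Hypothetical example labels, not taken from the real corpus.
example_targets = encode_labels([0, 1, 2, 1])
print(example_targets.shape)
```

The result has one row per tweet and two columns, which matches the `Dense(2, ...)` output layer of the model.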
My Keras model is:
model = Sequential()
model.add(Conv1D(32, kernel_size=3, activation='elu', padding='same',
                 input_shape=(15, 512)))
model.add(Conv1D(32, kernel_size=3, activation='elu', padding='same'))
model.add(Conv1D(32, kernel_size=3, activation='elu', padding='same'))
model.add(Conv1D(32, kernel_size=3, activation='elu', padding='same'))
model.add(Dropout(0.25))
model.add(Conv1D(32, kernel_size=2, activation='elu', padding='same'))
model.add(Conv1D(32, kernel_size=2, activation='elu', padding='same'))
model.add(Conv1D(32, kernel_size=2, activation='elu', padding='same'))
model.add(Conv1D(32, kernel_size=2, activation='elu', padding='same'))
model.add(Dropout(0.25))
model.add(Dense(256, activation='tanh'))
model.add(Dense(256, activation='tanh'))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(2, activation='sigmoid'))
The code to compile and fit the model is:
model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.0001, decay=1e-6),
              metrics=['accuracy'])
model.fit(np.array(X_train), np.array(Y_train),
          batch_size=batch_size,
          shuffle=True,
          epochs=nb_epochs,
          validation_data=(np.array(X_test), np.array(Y_test)),
          callbacks=[EarlyStopping(min_delta=0.00025, patience=2)])
My question is:
As I mentioned earlier, my corpus uses 0, 1, 2 as the polarity labels, and I encode Y_train and Y_test this way:
Y_train[i, :] = [1.0, 0.0]  # for negative tweets with label 0 in the corpus (and similarly for labels 1 and 2)
Is it logically correct to predict new input tweets this way?
Thanks for your patience.