Dynamically training a naive Bayes classifier

Time: 2020-05-26 14:04:46

Tags: python scikit-learn naivebayes online-machine-learning

Is it possible (and if so, how) to train an sklearn MultinomialNB classifier dynamically? I want to train (update) my spam classifier every time I feed it a new email.

I want this (which does not work):

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
for i in range(len(x_train)):
    clf.fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)

to give results similar to this (which works):

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
clf.fit(x_train, y_train)
preds = clf.predict(x_test)

1 answer:

Answer 0: (score: 2)

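Scikit-learn supports incremental learning for a number of algorithms, including MultinomialNB. See the documentation here.

You need to use the partial_fit() method instead of fit(): each call to fit() retrains the model from scratch, whereas partial_fit() updates the counts the model has already accumulated. The example code then becomes: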

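import numpy
# tts in the question is assumed to be an alias for train_test_split.
from sklearn.model_selection import train_test_split as tts
from sklearn.naive_bayes import MultinomialNB

x_train, x_test, y_train, y_test = tts(features, labels, test_size=0.2)
clf = MultinomialNB()
for i in range(len(x_train)):
    if i == 0:
        # The first call must list every class the model will ever see.
        clf.partial_fit([x_train[i]], [y_train[i]], classes=numpy.unique(y_train))
    else:
        clf.partial_fit([x_train[i]], [y_train[i]])
preds = clf.predict(x_test)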
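
To check that the sample-by-sample loop really gives "results similar to" a single fit() call, the two models can be compared directly. The following is a minimal sketch, not part of the original answer: the random count matrix and labels are synthetic stand-ins for the asker's features and labels, and the variable names are made up for illustration. Because MultinomialNB only accumulates per-class feature counts, the learned parameters should come out the same either way.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Synthetic word-count features and binary labels (assumption: these stand in
# for the vectorized emails and spam/ham labels from the question).
rng = np.random.default_rng(0)
features = rng.integers(0, 5, size=(200, 20))
labels = rng.integers(0, 2, size=200)

x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

# Reference model: trained on the whole training set in one call.
batch_clf = MultinomialNB().fit(x_train, y_train)

# Incremental model: updated one sample at a time.
online_clf = MultinomialNB()
all_classes = np.unique(labels)
for x, y in zip(x_train, y_train):
    # Passing the same classes on every call is allowed; only the first call needs it.
    online_clf.partial_fit([x], [y], classes=all_classes)

# Compare learned parameters and test-set predictions.
same_params = np.allclose(batch_clf.feature_log_prob_, online_clf.feature_log_prob_)
same_predictions = np.array_equal(batch_clf.predict(x_test), online_clf.predict(x_test))
print(same_params, same_predictions)
```

Passing classes on every call avoids the special case for the first iteration, at the cost of a small consistency check per call.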

Edit: added the classes argument to partial_fit, as suggested by @BobWazowski.
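
For the use case in the question (updating the classifier whenever a new email arrives), the feature space also has to stay fixed between updates. One possible approach, sketched here under assumptions that are not in the original thread, is to pair partial_fit() with scikit-learn's stateless HashingVectorizer. The update helper, the 0/1 label encoding, and the sample emails are all hypothetical.

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# A stateless vectorizer keeps the number of features fixed across updates.
# alternate_sign=False keeps all values non-negative, as MultinomialNB requires;
# norm=None leaves raw token counts.
vectorizer = HashingVectorizer(n_features=2**16, alternate_sign=False, norm=None)
clf = MultinomialNB()
ALL_CLASSES = np.array([0, 1])  # assumed encoding: 0 = ham, 1 = spam


def update(email_text, label):
    """Hypothetical helper: update the classifier with one labeled email."""
    x = vectorizer.transform([email_text])
    clf.partial_fit(x, [label], classes=ALL_CLASSES)


# Made-up emails arriving one at a time.
stream = [
    ("win a free prize now", 1),
    ("meeting moved to 3pm tomorrow", 0),
    ("free prize claim your money now", 1),
    ("lunch tomorrow with the project team", 0),
]
for text, label in stream:
    update(text, label)

prediction = clf.predict(vectorizer.transform(["claim your free prize"]))[0]
print(prediction)
```

A CountVectorizer fitted once on an initial corpus would also work, as long as its vocabulary is not refitted between updates.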