Question

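I need an SKLearn classifier that can be trained periodically by fitting new data to an already-trained model while retaining what it has learned about the previous classes.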

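I tried warm_start on LogisticRegression and RandomForestClassifier, but I can't figure out what configuration is needed to keep the classes from the previous fit. My retraining tool, written in Python, uses the pickle module to load a previously saved sklearn estimator and save it again after refitting.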

# Train file:

import pickle
from sklearn.linear_model import LogisticRegression

algorithm = LogisticRegression(warm_start=True)

# Fit data: X has shape (100, 2008), y has shape (100,); every y value is distinct
algorithm.fit(X, y)

print(len(algorithm.classes_))  # Expected 100

# Save the trained object
with open("logreg.alg", "wb") as f:
    pickle.dump(algorithm, f)


# Re-train file:

import pickle

# Load the trained object
with open("logreg.alg", "rb") as f:
    algorithm = pickle.load(f)

# Fit new data: X, y again have shapes (100, 2008) and (100,); no y value was seen before or repeats
algorithm.fit(X, y)

print(len(algorithm.classes_))  # Expected 200, actual 100

# Save the retrained object
with open("logreg.alg", "wb") as f:
    pickle.dump(algorithm, f)

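I expected the number of classes to grow with each refit, but the estimator seems to reset its values on every run. For example, the first fit should set up classes "0" through "99"; I then want to fit again with classes "100" through "199" and end up with a model trained on classes "0" through "199".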
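The behaviour described above can be reproduced at a smaller scale. The following is a minimal sketch, assuming synthetic random data and three classes per batch in place of the real (100, 2008) arrays; the variable names are hypothetical. It only illustrates that each call to fit recomputes classes_ from the labels passed to that call, even with warm_start=True.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption, not the data from the question).
rng = np.random.default_rng(0)
X_first = rng.normal(size=(30, 5))
y_first = np.repeat([0, 1, 2], 10)    # first batch: classes 0..2
X_second = rng.normal(size=(30, 5))
y_second = np.repeat([3, 4, 5], 10)   # second batch: classes 3..5

clf = LogisticRegression(warm_start=True, max_iter=1000)

clf.fit(X_first, y_first)
classes_after_first = clf.classes_.tolist()

# The second fit reuses the previous coefficients as a starting point,
# but the label set is taken from y_second alone.
clf.fit(X_second, y_second)
classes_after_second = clf.classes_.tolist()

print(classes_after_first)
print(classes_after_second)
```

In this sketch the second list contains only the labels of the second batch, which matches the "Expected 200, actual 100" observation in the question.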

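Am I doing something wrong, or am I misunderstanding the warm_start parameter? I would prefer to use logistic regression, but I am open to other classifiers.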

Thanks!

How to add new classes to an SKLearn algorithm

0 answers: