Question

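I pickled my classifier, loaded it in another notebook, and tried to call partial_fit on it, but I get the error "Number of features 378 does not match previous data 4598."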

with open("models/count_vect_Item Group.pkl", 'r') as f:
 global count_vect_item_group
 count_vect_item_group = joblib.load(f)

with open("models/model_Item Group.pkl", 'r') as f:
 global model_predicted_item_group
 model_predicted_item_group = joblib.load(f)

count_matrix_X_train = count_vect_item_group.fit_transform(X_test)
X_train_tf_idf = tf_idf(count_matrix_X_train)

model_predicted_item_group.partial_fit(X_train_tf_idf, labels_test )

I am unable to train my classifier on the new dataset.

Answer 1

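The error occurs because, before you pickled the classifier, you trained it on 4598 features (the number of columns in X), and now you are passing it only 378. The number of features must match what the classifier was originally trained on.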

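You can achieve this by calling only count_vect_item_group.transform(). Right now you are calling fit_transform() on count_vect_item_group again, which discards the vocabulary it learned earlier and fits to the new data instead, so it finds fewer features than before.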
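To make the difference concrete, here is a minimal sketch using scikit-learn's CountVectorizer on made-up data (the documents and variable names are hypothetical, not taken from the question). It compares the column count produced by transform() with the count produced by fitting a fresh vocabulary on the new data:

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical stand-ins for the original and the new text data
old_docs = ["red apple pie", "green apple tart", "blue cheese pizza"]
new_docs = ["red pizza", "green pie"]

vect = CountVectorizer()
n_old = vect.fit_transform(old_docs).shape[1]  # vocabulary learned from old_docs

# transform() reuses the learned vocabulary, so the column count is unchanged
n_transform = vect.transform(new_docs).shape[1]

# Fitting on new_docs alone learns a smaller vocabulary, hence fewer columns
n_refit = CountVectorizer().fit_transform(new_docs).shape[1]

print(n_old, n_transform, n_refit)
```

Note that transform() silently ignores any word in the new data that was not in the original vocabulary, which is what keeps the matrix width fixed.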

Change your code to:

count_matrix_X_train = count_vect_item_group.transform(X_test)
X_train_tf_idf = tf_idf(count_matrix_X_train)

model_predicted_item_group.partial_fit(X_train_tf_idf, labels_test)
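For reference, the whole round trip might look like the sketch below. It rests on several assumptions: the toy texts, labels, and file names are invented, and scikit-learn's TfidfTransformer stands in for the question's tf_idf() helper, whose implementation is not shown. It also passes a file path straight to joblib.load, which avoids having to pick a file mode in open().

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import SGDClassifier

# Hypothetical toy data standing in for the original and the new datasets
old_texts = ["steel bolt", "steel nut", "cotton shirt", "wool shirt"]
old_labels = ["hardware", "hardware", "clothing", "clothing"]
new_texts = ["brass bolt", "cotton scarf"]
new_labels = ["hardware", "clothing"]

# First notebook: fit the vectorizer, the TF-IDF step, and the classifier
vect = CountVectorizer()
tfidf = TfidfTransformer()  # assumption: replaces the unseen tf_idf() helper
X_old = tfidf.fit_transform(vect.fit_transform(old_texts))
clf = SGDClassifier(random_state=0)
# The first partial_fit call must be told every class it will ever see
clf.partial_fit(X_old, old_labels, classes=np.unique(old_labels))

# Persist all three fitted objects (hypothetical file names)
model_dir = tempfile.mkdtemp()
for name, obj in [("vect", vect), ("tfidf", tfidf), ("clf", clf)]:
    joblib.dump(obj, os.path.join(model_dir, name + ".pkl"))

# Second notebook: load by path, then only transform() the new data
vect2 = joblib.load(os.path.join(model_dir, "vect.pkl"))
tfidf2 = joblib.load(os.path.join(model_dir, "tfidf.pkl"))
clf2 = joblib.load(os.path.join(model_dir, "clf.pkl"))

X_new = tfidf2.transform(vect2.transform(new_texts))
clf2.partial_fit(X_new, new_labels)  # same number of columns as X_old

print(X_old.shape[1], X_new.shape[1])
```

Saving the fitted TF-IDF step alongside the vectorizer keeps the weighting consistent between sessions; if tf_idf() refits on each call, the column count will still match but the weights will differ.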

SKlearn SGD Partial Fit error: Number of features 378 does not match previous data 4598
