I did a K-fold split for my MultinomialNB model.
I am trying to balance the data with SMOTE (from the imblearn.over_sampling library).
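from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', OneVsRestClassifier(MultinomialNB(
        fit_prior=True, class_prior=None))),
])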
for train_indices, test_indices in k_fold.split(train_data):
    train_sequencies = train_data.iloc[train_indices]['NAME']
    label_train = train_data.iloc[train_indices][['SEARCH', 'OPTIONS_VOLUME', 'OPTIONS_QUANTITY', 'OPTIONS_PORTION',
                                                  'OPTIONS_WEIGHT', 'OPTIONS_SIZE', 'OPTIONS_CONCENTRATION',
                                                  'OPTIONS_CONTENT', 'OPTIONS_MANUFACTURER']]
    test_sequencies = train_data.iloc[test_indices]['NAME']
    label_test = train_data.iloc[test_indices][['SEARCH', 'OPTIONS_VOLUME', 'OPTIONS_QUANTITY', 'OPTIONS_PORTION',
                                                'OPTIONS_WEIGHT', 'OPTIONS_SIZE', 'OPTIONS_CONCENTRATION',
                                                'OPTIONS_CONTENT', 'OPTIONS_MANUFACTURER']]
    NB_pipeline.fit(train_sequencies, label_train)
    # Predict with the fitted pipeline, then compare true labels with predictions
    predictions = NB_pipeline.predict(test_sequencies)
    confusion += confusion_matrix(label_test, predictions)
    score = f1_score(label_test, predictions)
    scores.append(score)
I would like to run cross-validation for multi-label classification.
Answer 0 (score: 0)
OneVsRestClassifier fits one classifier per class, not one per target.
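To make the "one classifier per class" point concrete, here is a minimal sketch on made-up data (the count matrix and the three integer classes are assumptions for illustration only, not the asker's data). With a single target column holding three classes, the wrapper should end up with three fitted MultinomialNB estimators:

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB

# Hypothetical non-negative count features (MultinomialNB needs X >= 0).
X = np.array([
    [3, 0, 0], [2, 1, 0],
    [0, 3, 0], [0, 2, 1],
    [0, 0, 3], [1, 0, 2],
])
# A single target column with three classes.
y = np.array([0, 0, 1, 1, 2, 2])

ovr = OneVsRestClassifier(MultinomialNB())
ovr.fit(X, y)

# One binary "this class vs. the rest" estimator per class.
n_class_estimators = len(ovr.estimators_)
print(n_class_estimators, ovr.classes_)
```

Note that this handles several classes inside one target; it does not by itself handle several target columns.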
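Since MultinomialNB does not support multi-output target data, you can fit one MultinomialNB per target by wrapping it in MultiOutputClassifier. This is a simple strategy for extending classifiers that do not natively support multi-target classification.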
from sklearn.multioutput import MultiOutputClassifier

NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', MultiOutputClassifier(MultinomialNB(fit_prior=True, class_prior=None))),
])
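As a rough, self-contained illustration of the pipeline above, the sketch below fits it on a tiny invented dataset. The product names and the two integer-encoded target columns are assumptions made up for this example; the real data would use the `NAME` column and the `OPTIONS_*` columns from the question.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data standing in for the NAME column.
names = [
    "milk 1l bottle", "milk 2l bottle", "juice 1l carton",
    "juice 2l carton", "water 1l bottle", "water 2l bottle",
]
# Two target columns (integer-encoded): product kind and volume.
labels = np.array([
    [0, 0], [0, 1], [1, 0],
    [1, 1], [2, 0], [2, 1],
])

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(MultinomialNB())),
])
pipe.fit(names, labels)

# One MultinomialNB is fitted per target column.
n_estimators = len(pipe.named_steps["clf"].estimators_)
# Predictions come back with one column per target.
predictions = pipe.predict(["milk 1l carton"])
print(n_estimators, predictions.shape)
```

The fitted wrapper should hold one estimator per target column, and each prediction row has as many entries as there are targets.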
If you want to fit one classifier per class and per target (each class is fitted against all the other classes):
NB_pipeline = Pipeline([
    ('tfidf', TfidfVectorizer(stop_words=stop_words)),
    ('clf', MultiOutputClassifier(OneVsRestClassifier(MultinomialNB(fit_prior=True, class_prior=None)))),
])
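For the cross-validation part of the question, one possible approach is to score each target column separately, because as far as I know `f1_score` does not accept multiclass-multioutput targets directly. The sketch below is only an illustration under assumptions: the toy data, the 3-fold `KFold` settings, and the choice of macro averaging are all made up for the example.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import KFold
from sklearn.multioutput import MultiOutputClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline

# Hypothetical toy data: 12 product names, two integer-encoded targets.
names = np.array([
    "milk 1l bottle", "milk 2l bottle", "fresh milk 1l", "fresh milk 2l",
    "juice 1l carton", "juice 2l carton", "apple juice 1l", "apple juice 2l",
    "water 1l bottle", "water 2l bottle", "still water 1l", "still water 2l",
])
labels = np.array([
    [0, 0], [0, 1], [0, 0], [0, 1],
    [1, 0], [1, 1], [1, 0], [1, 1],
    [2, 0], [2, 1], [2, 0], [2, 1],
])

pipe = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", MultiOutputClassifier(MultinomialNB())),
])

k_fold = KFold(n_splits=3, shuffle=True, random_state=0)
fold_scores = []
for train_idx, test_idx in k_fold.split(names):
    pipe.fit(names[train_idx], labels[train_idx])
    predictions = pipe.predict(names[test_idx])
    # Score every target column on its own (macro-averaged F1).
    per_target = [
        f1_score(labels[test_idx][:, j], predictions[:, j], average="macro")
        for j in range(labels.shape[1])
    ]
    fold_scores.append(per_target)

# Average over folds: one F1 value per target column.
mean_f1_per_target = np.mean(fold_scores, axis=0)
print(mean_f1_per_target)
```

The same loop should work with either pipeline from this answer, since both expose the usual `fit` and `predict` methods. On very small folds scikit-learn may warn about classes that are never predicted.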