Here is the most_informative_feature_for_binary_classification method:
def most_informative_feature_for_binary_classification(vectorizer, classifier, n=100):
    """
    Identify the most important features given a vectorizer and a binary classifier.
    Set n to the number of weighted features you would like to show.
    """
    # Read the additional stopwords to remove (whitespace-separated words)
    with open(r'..\stopwords.txt') as file:
        additional_stopwords = set(file.read().split())

    class_labels = classifier.classes_
    # get_feature_names() was removed in scikit-learn 1.2;
    # newer versions use vectorizer.get_feature_names_out()
    feature_names = vectorizer.get_feature_names()

    # Pair each coefficient with its feature name before filtering, so that
    # dropping stopwords cannot shift names onto the wrong coefficients
    weighted = sorted(
        (coef, feat)
        for coef, feat in zip(classifier.coef_[0], feature_names)
        if feat not in additional_stopwords
    )
    topn_class1 = weighted[:n]
    topn_class2 = weighted[-n:]

    # class_labels = category
    # coef = coefficient
    # feat = most informative feature
    for coef, feat in topn_class1:
        print(class_labels[0], coef, feat)
    print()
    for coef, feat in reversed(topn_class2):
        print(class_labels[1], coef, feat)
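Currently, I can only print the 2 classes shown in the for loops above...
But I want to get 2 different classes each time the program is run. For example:
1st run: classes 3 and 4
2nd run: classes 4 and 5
3rd run: classes 5 and 6
4th run: classes 6 and 7
5th run: classes 7 and 8
After that, it should start the cycle over again.
Please help me take a look at my code :((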
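
One possible way to get a different pair of classes on each run is to remember, between runs, which pair was shown last. The sketch below is only an illustration of that idea, not a definitive implementation: the helper name next_class_pair and the state file class_pair_state.json are hypothetical and do not come from the code above. It stores a counter in a small JSON file, returns the next pair of adjacent labels, and wraps around after the last pair.

```python
import json
from pathlib import Path


def next_class_pair(class_labels, state_path="class_pair_state.json"):
    """Return the next pair of adjacent labels, advancing a counter saved on disk.

    Hypothetical helper: the name and the state file are assumptions
    made for this sketch.
    """
    labels = list(class_labels)
    if len(labels) < 2:
        raise ValueError("need at least two class labels")

    state_file = Path(state_path)
    # Read the position left by the previous run; start at 0 on the first run.
    if state_file.exists():
        start = json.loads(state_file.read_text())["start"]
    else:
        start = 0

    n_pairs = len(labels) - 1  # number of adjacent pairs
    start %= n_pairs           # guard against a stale or oversized value
    pair = (labels[start], labels[start + 1])

    # Save the position for the next run, wrapping around after the last pair.
    state_file.write_text(json.dumps({"start": (start + 1) % n_pairs}))
    return pair
```

Calling next_class_pair([3, 4, 5, 6, 7, 8]) once per program run would give (3, 4) on the first run, then (4, 5), (5, 6), (6, 7), (7, 8), and then (3, 4) again. To print features for the chosen pair, the function above would also need to read the coefficient row of each chosen class instead of always using classifier.coef_[0]. This assumes a multiclass linear model such as LogisticRegression, where coef_ has one row per entry of classes_.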