Question

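If I want the classifier to be an SVM (using scikit-learn), how should I modify the 'clf' variable so that the SVM classifier used for feature ranking achieves high accuracy? What parameters do I need to add? Which SVC kernel type would you recommend ("linear", "rbf", "sigmoid", or something else)? The code is taken from the following GitHub link: https://github.com/CynthiaKoopman/Network-Intrusion-Detection/blob/master/RandomForest_IDS.ipynb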

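I have 10 features ranked from 1 to 10 (using scikit-learn's recursive feature elimination, RFE). They were derived for DoS attacks in the NSL-KDD dataset using RandomForestClassifier, with 99% accuracy (using RFC as the prediction model).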

from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier
#from sklearn.svm import SVC

# Create a random forest classifier. clf is the variable holding the classifier

clf = RandomForestClassifier(n_jobs = 2)

# If the classifier used is an SVM:
#clf = SVC(kernel = "linear")
# Rank all features, i.e. continue the elimination until only one is left

rfe = RFE(clf, n_features_to_select=1)
rfe.fit(X_newDoS, Y_DoS)
print("DoS Features sorted by their rank:")
# rfe.ranking_ already holds integer ranks, so no rounding is needed
#print(sorted(zip(rfe.ranking_, newcolname_DoS)))
sorted_newcolname_DoS = sorted(zip(rfe.ranking_, newcolname_DoS))
sorted_newcolname_DoS
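
For comparison, the SVM variant hinted at in the commented-out lines could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is a synthetic stand-in generated with make_classification (not the NSL-KDD DoS subset), the names X_demo, y_demo and feature_names are hypothetical, and C=1.0 is just the scikit-learn default rather than a tuned value. RFE ranks features from the estimator's coef_ or feature_importances_, and SVC only exposes coef_ with kernel="linear", so the sketch uses the linear kernel. Scaling the features first is an added assumption, since SVM coefficients depend on feature scale.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for X_newDoS / Y_DoS (assumption: NOT the NSL-KDD data).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=4, random_state=0
)
# Hypothetical column names standing in for newcolname_DoS.
feature_names = ["f%d" % i for i in range(X_demo.shape[1])]

# Put all features on a comparable scale so the linear SVM's
# coefficients can be compared across features.
X_scaled = StandardScaler().fit_transform(X_demo)

# Linear kernel: the fitted model exposes coef_, which RFE needs.
svm_clf = SVC(kernel="linear", C=1.0)

# Rank all features by eliminating one at a time until one is left.
svm_rfe = RFE(svm_clf, n_features_to_select=1)
svm_rfe.fit(X_scaled, y_demo)

svm_ranking = sorted(zip(svm_rfe.ranking_, feature_names))
print("Features sorted by their rank (linear SVM):")
print(svm_ranking)
```

With the real data, X_scaled and y_demo would be replaced by the scaled X_newDoS and Y_DoS, and feature_names by newcolname_DoS. A random forest and a linear SVM measure feature importance in different ways, so the two rankings are not guaranteed to agree.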

I expected the ranked features from the two classifiers to be roughly 99% similar, but that is not what I observed.

How can I approach this feature-ranking problem with support vector classification?

0 answers: