plt.plot(range(1, len(vt_selector.grid_scores_) + 1), vt_selector.grid_scores_)
Traceback (most recent call last):
File "<ipython-input-382-35feb916b0a1>", line 1, in <module>
plt.plot(range(1, len(vt_selector.grid_scores_) + 1), vt_selector.grid_scores_)
AttributeError: 'VarianceThreshold' object has no attribute 'grid_scores_'
My full code:
#Feature Selection
import pandas as pd
import numpy as np
from sklearn import preprocessing as pp
from sklearn import model_selection as ms
from sklearn.feature_selection import VarianceThreshold, SelectKBest, chi2, mutual_info_classif, RFECV, SelectFromModel
from sklearn import svm
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import VarianceThreshold
from sklearn.decomposition import PCA
import seaborn
import matplotlib.pyplot as plt
X_orig = pd.read_csv('data.csv',encoding='latin1') #import CSV dataset and targets
y=pd.read_csv('target.csv')
Xn = pp.normalize(X_orig) #normalize input using l2 norm
X = pd.DataFrame(Xn)
#remove all low-variance features
vt_selector = VarianceThreshold(threshold=0.01)
#No feature in X meets the variance threshold 0.01
vt_res = vt_selector.fit_transform(X)
vt_selector.get_support()
#select the k best features using f_classif metric
kb_f_selector = SelectKBest(k=10)
#chi-squared requires features to be non-negative
from sklearn.preprocessing import MinMaxScaler
scaler = pp.MinMaxScaler()
X_x2 = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)
#select the k best features using chi-squared metric
kb_x2_selector = SelectKBest(chi2, k=10)
kb_x2_res = kb_x2_selector.fit_transform(X_x2, y)
#select the k best features using mutual information metric
selector = SelectKBest(mutual_info_classif, k=10)
kb_mi_res = selector.fit_transform(X, y)
#recursive feature elimination
estimator = svm.SVC(kernel="linear")
cv = ms.StratifiedShuffleSplit(n_splits=5, test_size=0.2, random_state=42)
rfecv_selector = RFECV(estimator, step=1, cv=cv, scoring='average_precision', verbose=49, n_jobs=-1)
rfecv_selector = selector.fit(X, y)
rfecv_selector.get_support()
#plots: replace vt_selector with desired selector
plt.figure()
plt.xlabel("Number of features selected")
plt.ylabel("Cross validation score")
plt.plot(range(1, len(vt_selector.grid_scores_) + 1), vt_selector.grid_scores_)
plt.show()
Answer 0 (score: 0)
I don't know exactly what you are doing here, but your vt_selector is a VarianceThreshold object, and it has no grid_scores_ attribute. You probably confused it with RFECV.
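For illustration, here is a minimal sketch of what the plotting step was presumably intended to look like once the RFECV selector itself is fitted (the posted code fits selector, the mutual-information SelectKBest, instead). The exact attribute depends on the scikit-learn version: older releases expose grid_scores_ on a fitted RFECV, while newer ones replace it with cv_results_.

# Sketch: fit the RFECV selector and plot its cross-validation scores.
# y.values.ravel() turns the one-column target DataFrame into a 1-D array.
rfecv_selector = RFECV(estimator, step=1, cv=cv, scoring='average_precision', n_jobs=-1)
rfecv_selector.fit(X, y.values.ravel())

if hasattr(rfecv_selector, 'cv_results_'):        # newer scikit-learn
    scores = rfecv_selector.cv_results_['mean_test_score']
else:                                              # older scikit-learn
    scores = rfecv_selector.grid_scores_

plt.figure()
plt.xlabel("Number of features selected")
plt.ylabel("Cross validation score")
plt.plot(range(1, len(scores) + 1), scores)
plt.show()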
Answer 1 (score: 0)
Quick tips:
First: indeed, the VarianceThreshold object has no grid_scores_ attribute. grid_scores_ belongs to GridSearchCV, where it has been deprecated and replaced by cv_results_.
Second: if you want to access it, you should assign the array returned by vt_selector.get_support():
support_vt_selector = vt_selector.get_support()
Otherwise it is of no use, and vt_selector remains an instance of VarianceThreshold in your code, which still has no grid_scores_ attribute.
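As a small illustrative sketch (assuming X is the normalized DataFrame from the question and vt_selector has already been fitted), the stored mask can then be used to see which columns survived the threshold:

# Keep the boolean mask and map it back to the DataFrame's columns.
support_vt_selector = vt_selector.get_support()
kept_columns = X.columns[support_vt_selector]
print(kept_columns.tolist())

# Equivalent: get_support(indices=True) returns the selected column indices directly.
kept_indices = vt_selector.get_support(indices=True)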