I am currently using Scikit-Learn version 0.19.2 and Python 3.6.3.
For some reason, I cannot access the cv_results_ attribute of my GridSearchCV.
This is the code I am using:
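import pandas as pd
from sklearn import metrics
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier
# input_file and stopwords are defined elsewhere in the script
df = pd.read_csv(input_file, sep = ";", header=None)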
numpy_array = df.as_matrix()
y=numpy_array[:,1]
y[y=='RR']=1
y[y=='AIRR']=0
print(y)
y=y.astype('int')
vectorizer = TfidfVectorizer(sublinear_tf=True, max_df=0.5, stop_words=stopwords)
X=numpy_array[:,0]
X=vectorizer.fit_transform(X)
param_grid = {"base_estimator__criterion" : ["gini", "entropy"],
"base_estimator__splitter" : ["best", "random"],
"n_estimators": [1, 2]
}
DTC = DecisionTreeClassifier(random_state = 11, max_features = "auto", class_weight = "balanced",max_depth = None)
# Create and fit an AdaBoosted decision tree
bdt = AdaBoostClassifier(base_estimator = DTC)
grid_search_ABC = GridSearchCV(bdt, param_grid=param_grid, scoring = 'roc_auc', cv=5, refit=True)
pred = grid_search_ABC.fit(X,y)
print(metrics.confusion_matrix(y, pred))
mean=grid_search_ABC.cv_results_['mean_test_score']
std=grid_search_ABC.cv_results_['std_test_score']
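From what I have read, this error usually means that the GridSearchCV has not been fitted, yet I can use it without any problem to predict new instances and so on.
Any pointers?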
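For reference, cv_results_ is one of the attributes that scikit-learn sets only once fit has completed, so its presence can be checked directly. The following is a minimal sketch of that check, not the code from the question: the iris dataset, the plain DecisionTreeClassifier and the max_depth grid are assumptions chosen only to keep the example small and self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Toy data and a tiny grid (illustrative assumptions, not the question's setup)
X, y = load_iris(return_X_y=True)
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid={"max_depth": [1, 2]}, cv=3)

# Before fit, the attribute does not exist on the search object
before = hasattr(search, "cv_results_")

search.fit(X, y)

# After a successful fit, cv_results_ is available
after = hasattr(search, "cv_results_")

print(before, after)
```

If the second check is True on the real search object, the search has been fitted and the attribute itself is not what is failing.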
Answer 0: (score: -1)
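The problem may lie in your dataset. This is why this site encourages you to post a verifiable example.
I just tried running your code on the iris dataset, and it works fine: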
from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
param_grid = {"base_estimator__criterion" : ["gini", "entropy"],
"base_estimator__splitter" : ["best", "random"],
"n_estimators": [1, 2]
}
DTC = DecisionTreeClassifier(random_state = 11, max_features = "auto", class_weight = "balanced",max_depth = None)
bdt = AdaBoostClassifier(base_estimator = DTC)
grid_search_ABC = GridSearchCV(bdt, param_grid=param_grid, scoring = 'roc_auc', cv=5, refit=True)
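# iris.target>0 turns the three-class target into a binary one, which 'roc_auc' scoring requires
# Note: fit returns the fitted search object itself, not predictions
pred = grid_search_ABC.fit(iris.data, iris.target>0)
print(grid_search_ABC.cv_results_['mean_test_score'])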
It works fine.
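One detail in the question's code may also be worth checking: fit returns the fitted GridSearchCV object itself rather than predictions, so passing that return value to metrics.confusion_matrix is likely to raise an exception before the cv_results_ lines are ever reached. Below is a minimal sketch of the intended flow under some assumptions: it uses the binarized iris data from this answer instead of the original text data, and it swaps the AdaBoost ensemble for a plain DecisionTreeClassifier so that the parameter names do not depend on the scikit-learn version.

```python
from sklearn import datasets, metrics
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Binary target built from iris (assumption: stands in for the RR / AIRR labels)
iris = datasets.load_iris()
X, y = iris.data, (iris.target > 0).astype(int)

# Simplified grid on a single tree (assumption: replaces the AdaBoost grid)
param_grid = {"criterion": ["gini", "entropy"],
              "splitter": ["best", "random"]}
search = GridSearchCV(DecisionTreeClassifier(random_state=11),
                      param_grid=param_grid, scoring="roc_auc",
                      cv=5, refit=True)

# fit returns the search object itself, not an array of predictions
fitted = search.fit(X, y)
returns_self = fitted is search

# Predictions come from predict, which uses the refitted best estimator
y_pred = search.predict(X)
cm = metrics.confusion_matrix(y, y_pred)

# With the confusion matrix computed from real predictions,
# execution reaches the cv_results_ lookups
mean = search.cv_results_["mean_test_score"]
std = search.cv_results_["std_test_score"]

print(cm)
print(mean)
print(std)
```

Note that the confusion matrix here is computed on the training data, as in the question, so it says little about performance on new data.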