TypeError: "__ensemble__" object is not callable

Posted: 2019-01-05 11:24:42

Tags: python-3.x scikit-learn classification typeerror

I don't know why it doesn't run for any of the ensembles. Maybe some parameters got mixed up?

Forest cover type data:
X of shape (581012, 54)
y of shape (581012,)

from sklearn.ensemble import VotingClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.tree import DecisionTreeClassifier
from sklearn import model_selection

classifier_names = ["logistic regression", "linear SVM", "nearest centroids", "decision tree"]
classifiers = [LogisticRegression, LinearSVC, NearestCentroid, DecisionTreeClassifier]

ensemble1 = VotingClassifier(classifiers)
ensemble2 = BaggingClassifier(classifiers)
ensemble3 = AdaBoostClassifier(classifiers)
ensembles = [ensemble1, ensemble2, ensemble3]
seed = 7  

for ensemble in ensembles:
    kfold = model_selection.KFold(n_splits=10, random_state=seed)
    for classifier in classifiers:
        model = ensemble(base_estimator=classifier, random_state=seed)
        results = model_selection.cross_val_score(ensemble, X, Y, cv=kfold)
        print(results.mean())    

I expected the ensembles to run for each classifier, but not even the first ensemble runs. I also changed the order so that BaggingClassifier came first, but it showed the same error: not callable.
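The root cause can be reproduced in isolation: each element of `ensembles` is already a constructed estimator instance, and scikit-learn estimator instances are not callable, so `ensemble(base_estimator=..., random_state=...)` raises exactly this TypeError. A minimal sketch:

```python
from sklearn.ensemble import BaggingClassifier

ensemble = BaggingClassifier()  # this is already an instance, not a class
try:
    ensemble(random_state=7)    # "calling" the instance, as the loop above does
except TypeError as exc:
    message = str(exc)
    print(message)              # 'BaggingClassifier' object is not callable
```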

2 Answers:

Answer 0 (score: 0):

For VotingClassifier, the estimators should be a list of (name, model) tuples. Note that you have to create an instance of each model and pass that inside the tuple, not the class itself.

From the documentation:

estimators : list of (string, estimator) tuples
    Invoking the fit method on the VotingClassifier will fit clones of those original estimators that will be stored in the class attribute self.estimators_. An estimator can be set to None using set_params.
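As a quick illustration of that last sentence, a minimal sketch (assuming a scikit-learn version in which None is still accepted; newer releases use the string "drop" instead):

```python
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Named (string, estimator) tuples, as the documentation requires
vc = VotingClassifier(estimators=[("lr", LogisticRegression()),
                                  ("dt", DecisionTreeClassifier())])

# Exclude the "lr" estimator from the vote via set_params
vc.set_params(lr=None)
print(vc.get_params()["lr"])  # None
```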

For the other two ensembles you can only evaluate one base model at a time: the ensemble fits n_estimators copies of that same base model. The way your code executes, it loops over the different classifiers but redefines the ensemble model on every iteration.

base_estimator : object or None, optional (default=None)
    The base estimator to fit on random subsets of the dataset. If None, then the base estimator is a decision tree.

n_estimators : int, optional (default=10)
    The number of base estimators in the ensemble.

Try this!

from sklearn import datasets, model_selection
from sklearn.ensemble import VotingClassifier, BaggingClassifier, AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.tree import DecisionTreeClassifier

iris = datasets.load_iris()
X, y = iris.data[:, 1:3], iris.target

classifier_names = ["logistic regression","linear SVM","nearest centroids", "decision tree"]
classifiers = [LogisticRegression(), LinearSVC(), NearestCentroid(), DecisionTreeClassifier()]


ensemble1 = VotingClassifier([(n,c) for n,c in zip(classifier_names,classifiers)])
ensemble2 = BaggingClassifier(base_estimator= DecisionTreeClassifier() , n_estimators= 10)
ensemble3 = AdaBoostClassifier(base_estimator= DecisionTreeClassifier() , n_estimators= 10)
ensembles = [ensemble1,ensemble2,ensemble3]
seed = 7  

for ensemble in ensembles:
    kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
    results = model_selection.cross_val_score(ensemble, X, y, cv=kfold)
    print(results.mean())    

Answer 1 (score: 0):

from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier, BaggingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.neighbors import NearestCentroid
from sklearn.tree import DecisionTreeClassifier
from sklearn import model_selection
import warnings
warnings.filterwarnings("ignore")

seed = 7
classifier_names = ["logistic regression", "linear SVM", "nearest centroids", "decision tree"]
classifiers = [LogisticRegression, LinearSVC, NearestCentroid, DecisionTreeClassifier]
for classifier in classifiers:
    # RandomForestClassifier builds its own trees and takes no base estimator
    ensemble1 = RandomForestClassifier(n_estimators=20, random_state=seed)
    ensemble2 = AdaBoostClassifier(base_estimator=classifier(),
                                   n_estimators=5, learning_rate=1, random_state=seed)
    ensemble3 = BaggingClassifier(base_estimator=classifier(),
                                  max_samples=0.5, n_estimators=20, random_state=seed)
    # Instantiate each classifier for the (name, model) tuples; "hard" voting
    # because LinearSVC and NearestCentroid have no predict_proba
    ensemble4 = VotingClassifier([(n, c()) for n, c in zip(classifier_names, classifiers)],
                                 voting="hard")
    ensembles = [ensemble1, ensemble2, ensemble3, ensemble4]

    for ensemble in ensembles:
        kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
        results = model_selection.cross_val_score(ensemble, X[1:100], y[1:100], cv=kfold)
        print("Mean accuracy of {}: {}".format(type(ensemble).__name__, results.mean()))