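I want to use one sklearn.ensemble.GradientBoostingClassifier as the init of another sklearn.ensemble.GradientBoostingClassifier, but it raises the error "IndexError: too many indices for array". This has been reported as a bug in sklearn, and I found related pull requests at sklearn on GitHub and sklearn on GitHub. If anyone has tried this and had a positive experience, please let me know.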
System information:
Mac OS X 10.11.5 (15F34)
python -V: Python 2.7.11 :: Anaconda custom (x86_64)
sklearn.__version__: '0.17.1'
Below is sample code that reproduces the error.
from sklearn.datasets import load_iris
from sklearn import ensemble
from sklearn.cross_validation import train_test_split
iris = load_iris()
X, y = iris.data, iris.target
X, y = X[y < 2], y[y < 2] # make it binary
X_train, X_test, y_train, y_test = train_test_split(X, y)
# Fit one GBT, then use it as init for a second GBT
clf = ensemble.GradientBoostingClassifier()
clf.fit(X_train, y_train)
clf2 = ensemble.GradientBoostingClassifier(init=clf)
clf2.fit(X_train, y_train)  # raises IndexError: too many indices for array
acc = clf2.score(X_test, y_test)
print("Accuracy: {:.4f}".format(acc))
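
For comparison, here is a minimal sketch of the same experiment written for a newer scikit-learn release. It rests on two assumptions that do not hold for 0.17.1: that train_test_split is imported from sklearn.model_selection, and that init accepts any estimator providing fit and predict_proba. The random_state values are added only to make the run repeatable. This is not a fix for 0.17.1 itself.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

iris = load_iris()
X, y = iris.data, iris.target
X, y = X[y < 2], y[y < 2]  # keep two classes to make it binary
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# First model, fitted on its own as in the snippet above.
clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_train, y_train)

# Second model: boosting starts from the predictions of `clf`.
# Assumption: this scikit-learn version supports an estimator as `init`.
clf2 = GradientBoostingClassifier(init=clf, random_state=0)
clf2.fit(X_train, y_train)

acc = clf2.score(X_test, y_test)
print("Accuracy: {:.4f}".format(acc))
```

If this still raises the IndexError, the installed version most likely predates support for estimator-valued init, and upgrading would be the first thing to try.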