Computing the loss of GradientBoostingClassifier in Python during a run

Time: 2015-05-09 20:53:19

Tags: python scikit-learn gradient-descent

I have the following code for creating and training a sklearn.ensemble.GradientBoostingClassifier:

import sys

class myMonitor:
    def __call__(self, i, estimator, locals):
        proba = estimator.predict_proba(Xp2)
        myloss = calculateMyLoss(proba, yp2) # calculateMyLoss is defined 
                                             # further on
        print("Calculated MYLOSS: ",myloss)
        return False

... #some more code

model = GradientBoostingClassifier(verbose=2, learning_rate = learningRate, n_estimators=numberOfIterations, max_depth=maxDepth, subsample = theSubsample, min_samples_leaf = minLeafSamples, max_features=maxFeatures)
model.fit(Xp1, yIntegersp1, monitor = myMonitor())

However, when I run this code, I get the following error:

    model.fit(Xp1, yIntegersp1, monitor = myMonitor())
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 980, in fit
    begin_at_stage, monitor)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 1058, in _fit_stages
    early_stopping = monitor(i, self, locals())
  File "OTTOSolverGBM.py", line 44, in __call__
    proba = estimator.predict_proba(Xp2)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 1376, in predict_proba
    score = self.decision_function(X)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 1102, in decision_function
    score = self._decision_function(X)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.py", line 1082, in _decision_function
    predict_stages(self.estimators_, X, self.learning_rate, score)
  File "sklearn/ensemble/_gradient_boosting.pyx", line 115, in sklearn.ensemble._gradient_boosting.predict_stages (sklearn/ensemble/_gradient_boosting.c:2502)
AttributeError: 'NoneType' object has no attribute 'tree_'

Why can't I use the same estimator (I checked that it is not None) to compute the class probabilities during the run? Is there a way to achieve what I want, i.e. to check the model against validation data at every iteration of the fitting procedure?

2 answers:

Answer 0: (score: 1)

Your estimator is self. Try

def __call__(self, i, locals):
    proba = self.predict_proba(Xp2)

Answer 1: (score: 0)

You could perhaps do something based on partial_fit, similar to this example on forests. For analysis after training, have a look at this example on gradient boosting.
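
As a rough illustration of the after-training route, here is a minimal sketch that uses staged_predict_proba to get the validation loss after each boosting stage. It assumes a recent scikit-learn; the synthetic data, the variable names (X_train, X_val, val_losses, best_stage) and the choice of log_loss as the metric are assumptions standing in for the asker's Xp1/yIntegersp1, Xp2/yp2 and calculateMyLoss.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption); swap in your own training/validation sets.
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                   max_depth=3, random_state=0)
model.fit(X_train, y_train)

# staged_predict_proba yields the predicted class probabilities after each
# boosting stage, so one pass gives the validation loss per iteration.
val_losses = [log_loss(y_val, proba, labels=model.classes_)
              for proba in model.staged_predict_proba(X_val)]

# Stage (1-based) with the lowest validation loss.
best_stage = int(np.argmin(val_losses)) + 1
print("Best number of stages:", best_stage)
print("Validation log loss at that stage:", val_losses[best_stage - 1])
```

This does not report the loss while fit is still running, but it gives the same per-iteration curve afterwards without touching the partially built ensemble.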