Question

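I am using sklearn's GradientBoostingRegressor. After fitting 2000 estimators, I wanted to add more estimators to it. Since rerunning the whole fitting process would take a long time, I used the set_params() method. Note that this is a multi-target problem, meaning I have 3 targets to fit. I use the following code to add more estimators.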

'''parameters: models (list  of length 3 in our case )
               train_X, train_y [n_samples x 3], test
               n_estimators : previous + 500 (default) [additional estimators]
               warm_start : True (default)
'''

def addMoreEstimators(train_X, train_y, test, models, n_estimators = 500, warm_start=True):
    params = {'n_estimators':n_estimators, 'warm_start':warm_start}

    gbm_pred= pd.DataFrame()

    for (i,stars),clf in zip(enumerate(['*','**','***']), models):
        clf.set_params(**params)
        %time clf.fit(train_X.todense(),train_y[stars])
        %time gbm_pred[stars] = clf.predict(test.todense())

    gbm_pred = gbm_pred.as_matrix()    
    # return the refitted models (the `models` argument) with their predictions
    gbm_dict = {'model': models, 'prediction': gbm_pred}

    return gbm_dict

Note: the models argument is a list of 3 fitted models, one for each of the 3 targets.
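For reference, here is a minimal sketch of the warm-start pattern the function relies on, reduced to a single model. The synthetic data and the small estimator counts are assumptions chosen so it runs quickly; they are not the data or settings from the question. It illustrates that n_estimators passed to set_params() is the new total number of stages, not the number of stages to add.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data (an assumption; not the data from the question).
rng = np.random.RandomState(0)
X = rng.rand(200, 5)
y = X[:, 0] + 0.1 * rng.randn(200)

clf = GradientBoostingRegressor(n_estimators=20, max_depth=1,
                                learning_rate=0.1, warm_start=True)
clf.fit(X, y)
n_before = clf.estimators_.shape[0]  # number of stages fitted so far

# n_estimators is the new total: this call fits 10 additional stages.
clf.set_params(n_estimators=30)
clf.fit(X, y)
n_after = clf.estimators_.shape[0]

print(n_before, n_after)
```

Comparing estimators_.shape[0] before and after each call is a cheap way to confirm how many stages a model actually holds.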

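When I first ran it with 2500 (originally I had 2000 estimators), it ran fine and gave me an output.

When I ran the same function with 3000 estimators, I got an AttributeError (see the traceback below). Here models contains the 3 fitted models. The traceback of the error follows (it is a bit long):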

AttributeError                            Traceback (most recent call last)
<ipython-input-104-9418ada3b36f> in <module>()
      7                                                                 test = val_X_tfidf[:,shortened_col_index],
      8                                                                 models = models,
----> 9                                                                 n_estimators = 3000)
     10 
     11 reduced_features_gbm_pred_3000_2_lr_1_msp_2 = reduced_features_gbm_model_3000_2_lr_1_msp_2['prediction']

<ipython-input-103-e15a4fb70b50> in addMoreEstimators(train_X, train_y, test, models, n_estimators, warm_start)
     15         
     16         clf.set_params(**params)
---> 17         get_ipython().magic(u'time clf.fit(train_X.todense(),train_y[stars])')
     18         print 'starting prediction'
     19 

//anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in magic(self, arg_s)
   2305         magic_name, _, magic_arg_s = arg_s.partition(' ')
   2306         magic_name = magic_name.lstrip(prefilter.ESC_MAGIC)
-> 2307         return self.run_line_magic(magic_name, magic_arg_s)
   2308 
   2309     #-------------------------------------------------------------------------

//anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.pyc in run_line_magic(self, magic_name, line)
   2226                 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
   2227             with self.builtin_trap:
-> 2228                 result = fn(*args,**kwargs)
   2229             return result
   2230 

//anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in time(self, line, cell, local_ns)

//anaconda/lib/python2.7/site-packages/IPython/core/magic.pyc in <lambda>(f, *a, **k)
    191     # but it's overkill for just that one bit of state.
    192     def magic_deco(arg):
--> 193         call = lambda f, *a, **k: f(*a, **k)
    194 
    195         if callable(arg):

//anaconda/lib/python2.7/site-packages/IPython/core/magics/execution.pyc in time(self, line, cell, local_ns)
   1160         if mode=='eval':
   1161             st = clock2()
-> 1162             out = eval(code, glob, local_ns)
   1163             end = clock2()
   1164         else:

<timed eval> in <module>()

//anaconda/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.pyc in fit(self, X, y, sample_weight, monitor)
    973                                     self.estimators_.shape[0]))
    974             begin_at_stage = self.estimators_.shape[0]
--> 975             y_pred = self._decision_function(X)
    976             self._resize_state()
    977 

//anaconda/lib/python2.7/site-packages/sklearn/ensemble/gradient_boosting.pyc in _decision_function(self, X)
   1080         # not doing input validation.
   1081         score = self._init_decision_function(X)
-> 1082         predict_stages(self.estimators_, X, self.learning_rate, score)
   1083         return score
   1084 

sklearn/ensemble/_gradient_boosting.pyx in sklearn.ensemble._gradient_boosting.predict_stages (sklearn/ensemble/_gradient_boosting.c:2502)()

AttributeError: 'int' object has no attribute 'tree_'

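Sorry for the long traceback, but I thought it would be hard to give me meaningful feedback without it.

Again, why am I getting this error?

Any help is greatly appreciated.

Thanks

Edit: Below is the code that generates models, which is one of the inputs to the function above.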

import numpy as np
import pandas as pd
from sklearn import ensemble
def updated_runGBM(train_X, train_y, test, 
           n_estimators =100, 
           max_depth = 1, 
           min_samples_split=1,
           learning_rate=0.01, 
           loss= 'ls',
           warm_start=True):
    '''train_X : n_samples x m_features
       train_y : n_samples x k_targets (multiple targets allowed)
       test    : n_samples x m_features
       warm_start : True (originally the default is False, but I want to add trees)
    '''
    params = {'n_estimators': n_estimators, 'max_depth': max_depth, 'min_samples_split': min_samples_split,
              'learning_rate': learning_rate, 'loss': loss,'warm_start':warm_start}
    gbm1 = ensemble.GradientBoostingRegressor(**params)
    gbm2 = ensemble.GradientBoostingRegressor(**params)
    gbm3 = ensemble.GradientBoostingRegressor(**params)

    gbm = [gbm1,gbm2,gbm3]

    gbm_pred= pd.DataFrame()
    for (i,stars),clf in zip(enumerate(['*','**','***']), gbm):
        %time clf.fit(train_X.todense(),train_y[stars])
        %time gbm_pred[stars] = clf.predict(test.todense())

    gbm_pred = gbm_pred.as_matrix()
    gbm_pred = np.clip(gbm_pred,0,np.inf)       
    gbm_dict ={'model': gbm, 'prediction': gbm_pred}       

    return gbm_dict

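Note that in the code above I removed some print statements to reduce clutter.

These are the only two functions I am using, nothing else (apart from the code that splits the data).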
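The traceback shows predict_stages running into an int where it expected a fitted tree, so one way to narrow this down is to look inside each model's estimators_ array before calling fit again. The sketch below is only a diagnostic under that reading; find_non_tree_stages is a hypothetical helper name, the data is synthetic, and the "broken" object is built by hand purely to show what the check reports. Whether the real models contain such entries, and how they got there, is not established by the question.

```python
import numpy as np
from types import SimpleNamespace
from sklearn.ensemble import GradientBoostingRegressor


def find_non_tree_stages(clf):
    """Return indices of stages in clf.estimators_ holding an entry without `tree_`.

    Hypothetical helper for illustration; it only reads clf.estimators_.
    """
    bad = []
    for stage, row in enumerate(clf.estimators_):
        if not all(hasattr(est, 'tree_') for est in row):
            bad.append(stage)
    return bad


# Synthetic stand-in data (an assumption; not the data from the question).
rng = np.random.RandomState(0)
X = rng.rand(100, 3)
y = X[:, 0]

clf = GradientBoostingRegressor(n_estimators=5, max_depth=1,
                                warm_start=True).fit(X, y)
healthy = find_non_tree_stages(clf)

# Hand-built stand-in with an int in the second stage, only to show the report.
stages = np.empty((2, 1), dtype=object)
stages[0, 0] = clf.estimators_[0, 0]
stages[1, 0] = 0
broken = SimpleNamespace(estimators_=stages)
flagged = find_non_tree_stages(broken)

print(healthy, flagged)
```

Running find_non_tree_stages on each of the 3 entries in models right after the 2500-estimator run would show whether any stage is already missing its tree before the 3000-estimator call.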

Why am I getting this AttributeError in Python?
