Question

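This is the error I get. My subset of the data (~100k examples) has exactly the same number of columns as the original dataset (400k examples). Training runs perfectly on the original dataset, but it fails on the subset.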

Traceback (most recent call last)
<ipython-input-14-35cf02055a2e> in <module>()
     15 from h2o.estimators.gbm import H2OGradientBoostingEstimator
     16 gbm_cv3 = H2OGradientBoostingEstimator(nfolds=2)
---> 17 gbm_cv3.train(x=x, y=y, training_frame=train)
     18 ## Getting all cross validated models
     19 all_models = gbm_cv3.cross_validation_models()



error_count = 2
    http_status = 412
    msg = u'Illegal argument(s) for GBM model: 
GBM_model_python_1533214798867_179.  Details: ERRR on field: 
_response: Response cannot be constant.'
    dev_msg = u'Illegal argument(s) for GBM model: 
GBM_model_python_1533214798867_179.  Details: ERRR on field: 
_response: Response cannot be constant.'

Answer 1

This is a user error.

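The "response" is the y column. In the subset of data you are using, every row has the same y value. When every y value is identical, you cannot train a supervised machine learning model: there is nothing for the model to learn.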

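This can happen when the outcome is rare: when you split the data randomly, you may end up with a partition that contains only one value. To check how many unique values the response column has in Python, run: train[y].unique()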
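To make the failure mode concrete, here is a minimal plain-Python sketch of the same check, run before training. It does not use H2O at all: the data is made up, and the helper names `response_counts` and `assert_trainable` are hypothetical, not part of any library. The idea is only to show how a subset of a dataset with a rare outcome can end up with a constant response even though the full dataset does not.

```python
from collections import Counter


def response_counts(values):
    """Count how often each distinct response value occurs."""
    return Counter(values)


def assert_trainable(values):
    """Raise ValueError if the response has fewer than two distinct values."""
    counts = response_counts(values)
    if len(counts) < 2:
        raise ValueError(f"Response is constant: {dict(counts)}")
    return counts


# Hypothetical data: 400 rows, of which only the last 4 are positive (a rare outcome).
full = [0] * 396 + [1] * 4

# A 100-row subset that happens to miss every positive row.
subset = full[:100]

print(response_counts(full))    # Counter({0: 396, 1: 4})
print(response_counts(subset))  # Counter({0: 100})
```

Calling `assert_trainable(subset)` raises a `ValueError` here, which is the situation the "Response cannot be constant" message describes. With an H2OFrame, `train[y].unique()` from the answer above serves the same purpose.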

H2O error when running on a subset of the data, although it runs perfectly on the original data
