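After a lot of trial and error, I finally managed to build my own stacking model. However, I cannot reproduce the same accuracy from one run to the next. I know I have to set the random_state parameter to some fixed value, but even after explicitly setting random_state before calling the class methods, I still get random results.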
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.model_selection import KFold

class Stacking(BaseEstimator, ClassifierMixin):
    def __init__(self, BaseModels, MetaModel, nfolds=3, seed=1):
        self.BaseModels = BaseModels
        self.MetaModel = MetaModel
        self.nfolds = nfolds
        # np.random.seed() returns None, so keep the seed value separately.
        self.seed = seed
        np.random.seed(seed)  # This fixed my error, thanks to foladev.

    def fit(self, X, y):
        # One list of fitted clones per base model (one clone per fold).
        self.BaseModels_ = [list() for model in self.BaseModels]
        self.MetaModel_ = clone(self.MetaModel)
        # random_state only has an effect when shuffle=True, so it is omitted here.
        kf = KFold(n_splits=self.nfolds, shuffle=False)
        out_of_fold_preds = np.zeros((X.shape[0], len(self.BaseModels)))
        # Iterate over the original estimators, not the (still empty) result lists.
        for index, model in enumerate(self.BaseModels):
            for train_index, out_of_fold_index in kf.split(X, y):
                instance = clone(model)
                self.BaseModels_[index].append(instance)
                instance.fit(X[train_index], y[train_index])
                preds = instance.predict(X[out_of_fold_index])
                out_of_fold_preds[out_of_fold_index, index] = preds
                # print(model, preds, out_of_fold_preds.shape)
        # Train the meta-model on the out-of-fold predictions.
        self.MetaModel_.fit(out_of_fold_preds, y)
        return self
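To see why the fold assignment itself is usually not the source of run-to-run variation, here is a plain-Python sketch of K-fold index splitting. It is an illustration only, not scikit-learn's implementation; the function name kfold_indices and its parameters are hypothetical. Without shuffling the folds are fully determined by the data order, and with shuffling they are determined by the seed.

```python
import random


def kfold_indices(n_samples, n_splits, shuffle=False, seed=None):
    """Yield (train, test) index lists for each fold (illustrative sketch)."""
    indices = list(range(n_samples))
    if shuffle:
        # A private generator keeps the shuffle independent of global state.
        random.Random(seed).shuffle(indices)
    # The first n_samples % n_splits folds receive one extra sample.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Unshuffled folds depend only on the order of the data.
print([test for _, test in kfold_indices(6, 3)])  # [[0, 1], [2, 3], [4, 5]]

# Shuffled folds repeat exactly when the same seed is used.
run_1 = list(kfold_indices(10, 3, shuffle=True, seed=6))
run_2 = list(kfold_indices(10, 3, shuffle=True, seed=6))
print(run_1 == run_2)  # True
```

If the folds are identical across runs, any remaining variation has to come from the estimators themselves, which is where their own seed parameters matter.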
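I use LogisticRegression, SGDClassifier and RandomForestClassifier as my base models, and XGBoost as my meta-model. I set random_state on all of the models, but it only works for the base models.
When I pass random_state to XGBClassifier, I get the error "__init__() got an unexpected keyword argument 'random_state'".
Note that I tried setting random_state before instantiating the class, and I also tried changing shuffle in KFold. Also, how do I initialize parameters inside the class methods?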
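On the question of initializing the seed inside the class, one possible pattern is to store the seed in __init__ and build the generator inside fit. The sketch below uses the standard library's random module as a stand-in for np.random, and both class names are hypothetical. It contrasts seeding the global generator once at construction with creating a private generator on every fit call; the same reasoning applies to np.random.seed.

```python
import random


class SeededInInit:
    """Seeds the global generator once, when the object is constructed."""

    def __init__(self, seed=1):
        self.seed = seed
        random.seed(seed)

    def fit(self):
        # Stands in for any training step that draws random numbers.
        return [random.random() for _ in range(3)]


class SeededInFit:
    """Stores the seed and builds a private generator on every fit call."""

    def __init__(self, seed=1):
        self.seed = seed

    def fit(self):
        rng = random.Random(self.seed)
        return [rng.random() for _ in range(3)]


a = SeededInInit(seed=1)
first_a, second_a = a.fit(), a.fit()

b = SeededInFit(seed=1)
first_b, second_b = b.fit(), b.fit()

print(first_a == second_a)  # False: the second call continues the same stream
print(first_b == second_b)  # True: each call restarts from the stored seed
```

Seeding in __init__ makes a fresh object reproducible, but repeated fit calls on the same object, or any other code that draws from the global generator in between, will still shift the results.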
Answer 0 (score: 1)
Judging from the API, XGBClassifier seems to expect 'seed'.
xgboost.XGBClassifier(max_depth=3, learning_rate=0.1, n_estimators=100, silent=True, objective='binary:logistic', booster='gbtree', n_jobs=1, nthread=None, gamma=0, min_child_weight=1, max_delta_step=0, subsample=1, colsample_bytree=1, colsample_bylevel=1, reg_alpha=0, reg_lambda=1, scale_pos_weight=1, base_score=0.5, random_state=0, seed=None, missing=None, **kwargs)
May I ask why you don't set a class-level seed and apply it to all the methods?
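A rough sketch of how one class-level seed could be pushed into every model, whichever name the model uses for it. It assumes the estimators follow the scikit-learn get_params/set_params convention. The helper apply_seed and the FakeEstimator stand-in are hypothetical and exist only to keep the example self-contained.

```python
def apply_seed(model, seed, names=("random_state", "seed")):
    """Set the first seed-like parameter the model exposes.

    Returns the parameter name that was set, or None if none was found.
    """
    params = model.get_params()
    for name in names:
        if name in params:
            model.set_params(**{name: seed})
            return name
    return None


class FakeEstimator:
    """Hypothetical stand-in that mimics get_params/set_params for the demo."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self, deep=True):
        return dict(self._params)

    def set_params(self, **params):
        self._params.update(params)
        return self


newer = FakeEstimator(random_state=None, n_estimators=100)
older = FakeEstimator(seed=None, n_estimators=100)
no_seed = FakeEstimator(n_estimators=100)

print(apply_seed(newer, 6))    # random_state
print(apply_seed(older, 6))    # seed
print(apply_seed(no_seed, 6))  # None
```

In a stacking class, a helper like this could be called on each clone inside fit, so that the base models and the meta-model all receive the same stored seed.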