XGBRegressor: changing random_state has no effect

Time: 2018-06-11 11:47:00

Tags: python-3.x xgboost

xgboost.XGBRegressor seems to produce the same results even when it is given a new random seed.

According to the xgboost documentation for xgboost.XGBRegressor:

  

seed : int. Random number seed. (Deprecated, please use random_state)

     

random_state : int. Random number seed. (replaces seed)

So random_state is the one to use, but the model produces the same results whether I set random_state or seed. Is this a bug?

from xgboost import XGBRegressor
# Note: load_boston was removed in scikit-learn 1.2, so this needs an older release.
from sklearn.datasets import load_boston
import numpy as np
from itertools import product

def xgb_train_predict(random_state=0, seed=None):
    # Fit with default hyperparameters and predict on the training data.
    X, y = load_boston(return_X_y=True)
    xgb = XGBRegressor(random_state=random_state, seed=seed)
    xgb.fit(X, y)
    y_ = xgb.predict(X)
    return y_

# Reference predictions with random_state=0 and seed=None.
check = xgb_train_predict()

random_state = [1, 42, 58, 69, 72]
seed = [None, 2, 24, 85, 96]

# Every (random_state, seed) pair is compared against the reference.
for r, s in product(random_state, seed):
    y_ = xgb_train_predict(r, s)
    assert np.equal(y_, check).all()
    print('CHECK! \t random_state: {} \t seed: {}'.format(r, s))

[Out]:
    CHECK!   random_state: 1     seed: None
    CHECK!   random_state: 1     seed: 2
    CHECK!   random_state: 1     seed: 24
    CHECK!   random_state: 1     seed: 85
    CHECK!   random_state: 1     seed: 96
    CHECK!   random_state: 42    seed: None
    CHECK!   random_state: 42    seed: 2
    CHECK!   random_state: 42    seed: 24
    CHECK!   random_state: 42    seed: 85
    CHECK!   random_state: 42    seed: 96
    CHECK!   random_state: 58    seed: None
    CHECK!   random_state: 58    seed: 2
    CHECK!   random_state: 58    seed: 24
    CHECK!   random_state: 58    seed: 85
    CHECK!   random_state: 58    seed: 96
    CHECK!   random_state: 69    seed: None
    CHECK!   random_state: 69    seed: 2
    CHECK!   random_state: 69    seed: 24
    CHECK!   random_state: 69    seed: 85
    CHECK!   random_state: 69    seed: 96
    CHECK!   random_state: 72    seed: None
    CHECK!   random_state: 72    seed: 2
    CHECK!   random_state: 72    seed: 24
    CHECK!   random_state: 72    seed: 85
    CHECK!   random_state: 72    seed: 96

1 answer:

Answer 0 (score: 5)

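It seems (I did not know this myself before I started digging for an answer :)) that xgboost uses a random generator only for subsampling; see Laurae's comment on a similar GitHub issue. Otherwise the behavior is deterministic.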

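If you do use subsampling, there is an issue with how the current sklearn API in xgboost handles seed / random_state. seed is indeed marked as deprecated, but it seems that if you provide it, it still takes precedence over random_state, as can be seen in the code. This remark is relevant only when seed is not None.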
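The precedence described above can be pictured with a small sketch. The helper effective_seed below is hypothetical and is not xgboost's actual code; it only mirrors the behavior this answer reports for the wrapper at that time.

```python
def effective_seed(random_state=0, seed=None):
    """Hypothetical helper: return the seed that would actually be used.

    Mirrors the behavior described in the answer: a deprecated `seed`
    that is not None wins over `random_state`.
    """
    if seed is not None:
        return seed
    return random_state


# Only random_state given: it is used as-is.
print(effective_seed(random_state=42))

# Both given: the deprecated seed overrides random_state.
print(effective_seed(random_state=42, seed=7))
```

Under this reading, varying random_state while also passing a fixed seed would change nothing, even with subsampling enabled.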