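xgboost.XGBRegressor seems to produce the same results for any random_state or seed.
According to the xgboost documentation for xgboost.XGBRegressor:
seed : int Random number seed. (Deprecated, please use random_state)
random_state : int Random number seed. (replaces seed)
random_state is the one to use, but whether I use random_state or seed, the model produces the same results. Is this a bug?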
from itertools import product

import numpy as np
# Note: load_boston was removed in scikit-learn 1.2; this needs an older release.
from sklearn.datasets import load_boston
from xgboost import XGBRegressor


def xgb_train_predict(random_state=0, seed=None):
    # Fit on the full dataset and predict on the same data.
    X, y = load_boston(return_X_y=True)
    xgb = XGBRegressor(random_state=random_state, seed=seed)
    xgb.fit(X, y)
    y_ = xgb.predict(X)
    return y_


# Baseline predictions with the default random_state and no seed.
check = xgb_train_predict()

random_state = [1, 42, 58, 69, 72]
seed = [None, 2, 24, 85, 96]

# Every combination must reproduce the baseline exactly, or the assert fails.
for r, s in product(random_state, seed):
    y_ = xgb_train_predict(r, s)
    assert np.equal(y_, check).all()
    print('CHECK! \t random_state: {} \t seed: {}'.format(r, s))
[Out]:
CHECK! random_state: 1 seed: None
CHECK! random_state: 1 seed: 2
CHECK! random_state: 1 seed: 24
CHECK! random_state: 1 seed: 85
CHECK! random_state: 1 seed: 96
CHECK! random_state: 42 seed: None
CHECK! random_state: 42 seed: 2
CHECK! random_state: 42 seed: 24
CHECK! random_state: 42 seed: 85
CHECK! random_state: 42 seed: 96
CHECK! random_state: 58 seed: None
CHECK! random_state: 58 seed: 2
CHECK! random_state: 58 seed: 24
CHECK! random_state: 58 seed: 85
CHECK! random_state: 58 seed: 96
CHECK! random_state: 69 seed: None
CHECK! random_state: 69 seed: 2
CHECK! random_state: 69 seed: 24
CHECK! random_state: 69 seed: 85
CHECK! random_state: 69 seed: 96
CHECK! random_state: 72 seed: None
CHECK! random_state: 72 seed: 2
CHECK! random_state: 72 seed: 24
CHECK! random_state: 72 seed: 85
CHECK! random_state: 72 seed: 96
Answer 0 (score: 5)
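It seems (I did not know this myself before I started digging for an answer :)) that xgboost uses the random generator only for subsampling; see this Laurae's comment on a similar github issue. Otherwise the behavior is deterministic.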
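If you do use subsampling, though, there is an issue with how the current sklearn API in xgboost handles seed / random_state. seed is indeed claimed to be deprecated, but it seems that if one provides it, it will still be used over random_state, as can be seen here in the code. That only applies when seed is not None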
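
The precedence described above can be sketched in a few lines. This is not xgboost's actual code: resolve_seed is a hypothetical helper that only mirrors the behavior described in this answer, namely that a non-None seed wins over random_state.

```python
import warnings


def resolve_seed(random_state=0, seed=None):
    """Hypothetical helper: return the seed value that ends up being used."""
    if seed is not None:
        # The deprecated argument still takes precedence when it is supplied.
        warnings.warn(
            "seed is deprecated, please use random_state", DeprecationWarning
        )
        return seed
    # Only when seed is None does random_state take effect.
    return random_state


print(resolve_seed(random_state=42))           # 42
print(resolve_seed(random_state=42, seed=7))   # 7
```

Under this logic, varying random_state while also passing a fixed non-None seed would have no effect, even with subsampling enabled.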