I've been wrestling with this strange behavior of XGBClassifier, which is supposed to behave just like RandomForestClassifier:
import xgboost as xgb
from sklearn.ensemble import RandomForestClassifier

class my_rf(RandomForestClassifier):
    def important_features(self, X):
        return super(RandomForestClassifier, self).feature_importances_

class my_xgb(xgb.XGBClassifier):
    def important_features(self, X):
        return super(xgb.XGBClassifier, self).feature_importances_

c1 = my_rf()
c1.fit(X,y)
c1.important_features(X)  # works
This code fails :(
c2 = my_xgb()
c2.fit(X,y)
c2.important_features(X) #fails with AttributeError: 'super' object has no attribute 'feature_importances_'
I've stared at both bits of code and they look identical to me! What am I missing? Sorry if this is a noob question; the mysteries of Python OOP elude me.
Edit:
If I use vanilla xgb, with no inheritance, everything works fine:
import xgboost as xgb
print "version:", xgb.__version__
c = xgb.XGBClassifier()
c.fit(X_train.as_matrix(), y_train.label)
print c.feature_importances_[:5]
version: 0.4
[ 0.4039548 0.05932203 0.06779661 0.00847458 0. ]
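The super() lookup in the question can be sketched with plain classes, no xgboost needed (the class names below are stand-ins, not real xgboost classes). super(Parent, self) skips both the subclass and Parent in the MRO, so the attribute must exist further up the chain; if the base class doesn't define the property, the lookup raises exactly the error from the question:

```python
class Grandparent:
    # stands in for a base class that DOES define the property
    # (like sklearn's forest base class)
    @property
    def feature_importances_(self):
        return [0.5, 0.5]

class Parent(Grandparent):
    pass

class Child(Parent):
    def important_features(self):
        # super(Parent, self) skips Child AND Parent in the MRO,
        # so the attribute is resolved on Grandparent
        return super(Parent, self).feature_importances_

class BareBase:
    # stands in for a base class WITHOUT the property
    # (like xgboost 0.4's XGBModel)
    pass

class BareChild(BareBase):
    def important_features(self):
        return super(BareChild, self).feature_importances_

print(Child().important_features())   # [0.5, 0.5]
try:
    BareChild().important_features()
except AttributeError as e:
    print(e)   # 'super' object has no attribute 'feature_importances_'
```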
Answer 0 (score: 1)
As far as I know, feature_importances_ is not implemented in XGBoost. You can roll your own using something like permutation feature importance:
import random
import numpy as np
from sklearn.cross_validation import cross_val_score

def feature_importances(clf, X, y):
    score = np.mean(cross_val_score(clf, X, y, scoring='roc_auc'))
    importances = {}
    for i in range(X.shape[1]):
        X_perm = X.copy()
        X_perm[:,i] = random.sample(X[:,i].tolist(), X.shape[0])
        perm_score = np.mean(cross_val_score(clf, X_perm, y, scoring='roc_auc'))
        importances[i] = score - perm_score
    return importances
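The helper above imports from sklearn.cross_validation, which later scikit-learn releases moved to sklearn.model_selection. A self-contained usage sketch along the same lines (with the modern import path, accuracy scoring since roc_auc expects binary labels, and numpy's permutation in place of random.sample) might look like this:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def feature_importances(clf, X, y):
    # baseline cross-validated score on the untouched data
    base = np.mean(cross_val_score(clf, X, y, scoring='accuracy'))
    rng = np.random.default_rng(0)
    importances = {}
    for i in range(X.shape[1]):
        # shuffle one column at a time; the score drop measures
        # how much the model relied on that feature
        X_perm = X.copy()
        X_perm[:, i] = rng.permutation(X_perm[:, i])
        perm = np.mean(cross_val_score(clf, X_perm, y, scoring='accuracy'))
        importances[i] = base - perm
    return importances

X, y = load_iris(return_X_y=True)
imps = feature_importances(RandomForestClassifier(random_state=0), X, y)
# features with the largest score drop matter most
print(sorted(imps, key=imps.get, reverse=True))
```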
Answer 1 (score: 1)
Your output shows the version is 0.4, and the repository tree of the last stable 0.4x release (published Jan 15, 2016) shows that the sklearn.py file did not yet have feature_importances_. This feature was actually introduced in this commit on Feb 8, 2016. I cloned the current GitHub repository, built xgboost from source, installed it, and the code works fine:
from sklearn import datasets
from sklearn.ensemble.forest import RandomForestClassifier
import xgboost as xgb
print "version:", xgb.__version__

class my_rf(RandomForestClassifier):
    def important_features(self, X):
        return super(RandomForestClassifier, self).feature_importances_

class my_xgb(xgb.XGBClassifier):
    def important_features(self, X):
        return super(xgb.XGBClassifier, self).feature_importances_

iris = datasets.load_iris()
X = iris.data
y = iris.target

c1 = my_rf()
c1.fit(X,y)
print c1.important_features(X)

c2 = my_xgb()
c2.fit(X,y)
print c2.important_features(X)

c3 = xgb.XGBClassifier()
c3.fit(X, y)
print c3.feature_importances_
Output:
version: 0.6
[ 0.11834481 0.02627218 0.57008797 0.28529505]
[ 0.17701453 0.11228534 0.41479525 0.29590487]
[ 0.17701453 0.11228534 0.41479525 0.29590487]
If you are using XGBRegressor, make sure you cloned the repository after Dec 1, 2016, because according to this commit, feature_importances_ was moved to the base XGBModel, which lets XGBRegressor access it. Adding this to the code above:

class my_xgb_regressor(xgb.XGBRegressor):
    def important_features(self, X):
        return super(xgb.XGBRegressor, self).feature_importances_

c4 = my_xgb_regressor()
c4.fit(X, y)
print c4.important_features(X)

Output:

version: 0.6
[ 0.0307026   0.01456868  0.45198349  0.50274523]
[ 0.17701453  0.11228534  0.41479525  0.29590487]
[ 0.17701453  0.11228534  0.41479525  0.29590487]
[ 0.25        0.17518248  0.34489051  0.229927  ]
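Both answers pin the behavior to build dates and commits. An alternative (a sketch with a stand-in class, so it runs without xgboost installed; the helper name is hypothetical) is a runtime check for whether the installed estimator class exposes the property at all, instead of reasoning about which commit you built from:

```python
class DummyModel:
    # stands in for an xgboost estimator class whose build
    # includes the feature_importances_ property
    @property
    def feature_importances_(self):
        return [0.25, 0.75]

def has_builtin_importances(estimator_cls):
    """Check on the CLASS (not an instance) whether the
    feature_importances_ property is defined, so no fitted
    model is required."""
    attr = getattr(estimator_cls, "feature_importances_", None)
    return isinstance(attr, property)

print(has_builtin_importances(DummyModel))   # True
print(has_builtin_importances(object))       # False
```

With a real install you would pass the estimator class itself (e.g. xgb.XGBClassifier) and fall back to a hand-rolled permutation importance when the check returns False.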