When I run the following code, I get the error shown below.
Code:
importance = bst.get_fscore(fmap='xgb.fmap')
importance = sorted(importance.items(), key=operator.itemgetter(1))
Error:
  File "scripts/xgboost_bnp.py", line 225, in <module>
    importance = bst.get_fscore(fmap='xgb.fmap')
  File "/usr/lib/python2.7/site-packages/xgboost/core.py", line 754, in get_fscore
    trees = self.get_dump(fmap)
  File "/usr/lib/python2.7/site-packages/xgboost/core.py", line 740, in get_dump
    ctypes.byref(sarr)))
  File "/usr/lib/python2.7/site-packages/xgboost/core.py", line 92, in _check_call
    raise XGBoostError(_LIB.XGBGetLastError())
xgboost.core.XGBoostError: can not open file "xgb.fmap"
Answer 0 (score: 5)
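The error is raised because you are calling get_fscore with the optional parameter fmap. That parameter tells xgboost to read the feature names from a feature map file called xgb.fmap, and no such file exists in your working directory. Either create the file first or call get_fscore() without the fmap argument.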
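If you do want to keep the fmap argument, the file has to be written before get_fscore is called. The following is a minimal sketch, not part of the original answer: the helper name write_feature_map and the feature names are made up for illustration, and it assumes the commonly used tab-separated layout of one "index, name, type" line per feature, with q marking a quantitative feature. Check that layout against your installed xgboost version.

```python
import os
import tempfile


def write_feature_map(feature_names, path):
    """Write one 'index<TAB>name<TAB>type' line per feature (hypothetical helper)."""
    with open(path, "w") as f:
        for i, name in enumerate(feature_names):
            # Assumption: every feature is quantitative, hence the 'q' type.
            f.write("{0}\t{1}\tq\n".format(i, name))
    return path


# Made-up feature names; use the column names of your training data instead.
features = ["age", "income", "clicks"]
fmap_path = write_feature_map(
    features, os.path.join(tempfile.mkdtemp(), "xgb.fmap"))

# With the file in place, the call from the question has something to open:
# importance = bst.get_fscore(fmap=fmap_path)
```

The sketch writes to a temporary directory to avoid cluttering the working directory; passing the path 'xgb.fmap' instead would match the call in the question.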
Here is a function that returns the feature names and their importances, sorted from most to least important:
import xgboost as xgb
import pandas as pd


def get_xgb_feat_importances(clf):
    if isinstance(clf, xgb.XGBModel):
        # clf has been created by calling
        # xgb.XGBClassifier().fit() or xgb.XGBRegressor().fit().
        # Newer xgboost releases expose the underlying Booster as
        # get_booster(); older ones (as in the question) used booster().
        if hasattr(clf, 'get_booster'):
            fscore = clf.get_booster().get_fscore()
        else:
            fscore = clf.booster().get_fscore()
    else:
        # clf has been created by calling xgb.train.
        # Thus, clf is an instance of xgb.Booster.
        fscore = clf.get_fscore()

    # One row per feature; items() works on both Python 2 and Python 3.
    feat_importances = []
    for ft, score in fscore.items():
        feat_importances.append({'Feature': ft, 'Importance': score})
    feat_importances = pd.DataFrame(feat_importances)
    feat_importances = feat_importances.sort_values(
        by='Importance', ascending=False).reset_index(drop=True)

    # Divide the importances by the sum of all importances
    # to get relative importances. By using relative importances
    # the sum of all importances will equal 1, i.e.,
    # feat_importances['Importance'].sum() == 1
    feat_importances['Importance'] /= feat_importances['Importance'].sum()

    # Print the most important features and their importances
    print(feat_importances.head())
    return feat_importances
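
The sort-and-normalize step in the function above can also be sketched without pandas, operating directly on a plain dict shaped like the one get_fscore() returns (feature name mapped to a count). The helper name relative_importances and the sample scores are made up for illustration.

```python
import operator


def relative_importances(fscore):
    """Return (feature, share) pairs sorted from most to least important."""
    total = float(sum(fscore.values()))
    if total == 0:
        # Nothing to rank: no feature was ever used in a split.
        return []
    ranked = sorted(fscore.items(), key=operator.itemgetter(1), reverse=True)
    # Dividing by the total makes the shares sum to 1.
    return [(name, score / total) for name, score in ranked]


# Made-up scores standing in for the result of bst.get_fscore().
fscore = {"f0": 12, "f1": 30, "f2": 18}
ranked = relative_importances(fscore)
for name, share in ranked:
    print("{0}: {1:.2f}".format(name, share))
```

This mirrors the question's own use of operator.itemgetter(1), with reverse=True so the most important feature comes first.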