I'm trying to sort the feature_importances_ results of sklearn.ensemble.RandomForestRegressor.
I have the following function:
def get_feature_importances(cols, importances):
    feats = {}
    for feature, importance in zip(cols, importances):
        feats[feature] = importance
    importances = pd.DataFrame.from_dict(feats, orient='index').rename(columns={0: 'Gini-importance'})
    importances.sort_values(by='Gini-importance')
    return importances
I use it like this:
importances = get_feature_importances(X_test.columns, rf.feature_importances_)
print()
print(importances)
I get the following results:
| PART | 0.035034 |
| MONTH1 | 0.02507 |
| YEAR1 | 0.020075 |
| MONTH2 | 0.02321 |
| YEAR2 | 0.017861 |
| MONTH3 | 0.042606 |
| YEAR3 | 0.028508 |
| DAYS | 0.047603 |
| MEDIANDIFF | 0.037696 |
| F2 | 0.008783 |
| F1 | 0.015764 |
| F6 | 0.017933 |
| F4 | 0.017511 |
| F5 | 0.017799 |
| SS22 | 0.010521 |
| SS21 | 0.003896 |
| SS19 | 0.003894 |
| SS23 | 0.005249 |
| SS20 | 0.005127 |
| RR | 0.021626 |
| HI_HOURS | 0.067584 |
| OI_HOURS | 0.054369 |
| MI_HOURS | 0.062121 |
| PERFORMANCE_FACTOR | 0.033572 |
| PERFORMANCE_INDEX | 0.073884 |
| NUMPA | 0.022445 |
| BUMPA | 0.024192 |
| ELOH | 0.04386 |
| FFX1 | 0.128367 |
| FFX2 | 0.083839 |
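I thought the importances.sort_values(by='Gini-importance') line would sort them, but it doesn't. Why isn't this working correctly?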
Answer 0 (score: 2)
importances.sort_values(by='Gini-importance')
returns a new, sorted DataFrame, and your function ignores that return value.
You want return importances.sort_values(by='Gini-importance')
.
Or you can have sort_values sort in place
:
importances.sort_values(by='Gini-importance', inplace=True)
return importances
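To make the difference concrete, here is a minimal sketch, assuming pandas is installed. It reuses three rows from the table in the question instead of a fitted model, and the names unsorted, result, and ranked are only illustrative. It first shows that sort_values leaves the original DataFrame untouched, then shows a version of the function that returns the sorted result. Passing ascending=False is an optional extra, not something the question asked for; it puts the most important feature first.

```python
import pandas as pd


def get_feature_importances(cols, importances):
    # Build a one-column DataFrame indexed by feature name.
    frame = pd.DataFrame({'Gini-importance': list(importances)}, index=list(cols))
    # sort_values returns a new, sorted DataFrame, so return that result.
    return frame.sort_values(by='Gini-importance', ascending=False)


# Three rows taken from the table in the question, for illustration only.
cols = ['PART', 'FFX1', 'SS19']
values = [0.035034, 0.128367, 0.003894]

unsorted = pd.DataFrame({'Gini-importance': values}, index=cols)
result = unsorted.sort_values(by='Gini-importance')  # a new object is returned

print(unsorted.index.tolist())  # original row order is unchanged
print(result.index.tolist())    # rows ordered by ascending importance

ranked = get_feature_importances(cols, values)
print(ranked)
```

With a real model, the call from the question stays the same: get_feature_importances(X_test.columns, rf.feature_importances_).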