Question

I am modeling a dataset with a random forest classifier and want to print the features the random forest selected. I tried feature_importances_ as follows:

modelRF.feature_importances_

But it raises this error:

NameError: name 'feature_importances_' is not defined

Similarly, calling the fit method raises:

AttributeError: 'RandomForest' object has no attribute 'fit'

These are the parameters the random forest classifier accepts:

(data, x_cols, y_col, num_trees, method, impurity, max_depth=10, min_instance_per_node=20, min_information_gain=0.01, max_bin=32, feature_subset_strategy=u'auto', seed=123, async_execution=False)

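I want to print the features selected by the random forest.

Do I need to define something extra for the methods above to work with this random forest? (I am using the adatao / arimo package to model the RF on a distributed platform.)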

Answer 1

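The arimo package has a module called variable_importance, which gives you the features selected by the random forest classifier.

It returns a pandas DataFrame with the columns variable name and importance score.

Variables whose importance score is > 0.0 are the features the random forest classifier selected. This works in Python with the arimo package on the distributed platform.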
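The filtering step described above can be sketched with plain pandas. The DataFrame here is a hypothetical stand-in with made-up variables and scores; it is not produced by arimo, and the exact call that returns the real table is not shown in this answer.

```python
import pandas as pd

# Hypothetical stand-in for the variable-importance table described above.
# The rows are invented for illustration; only the two column names come
# from the answer.
importance = pd.DataFrame({
    "variable name": ["age", "income", "zip_code"],
    "importance score": [0.42, 0.58, 0.0],
})

# Keep only variables with a strictly positive importance score,
# then order them from most to least important.
selected = importance[importance["importance score"] > 0.0]
selected = selected.sort_values("importance score", ascending=False)

selected_names = selected["variable name"].tolist()
print(selected)
```

After this runs, selected_names holds the names of the variables the model actually used, ordered by importance.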

In other packages, the equivalent is:

model.feature_importances_
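For comparison, here is a minimal sketch of the feature_importances_ route using scikit-learn, which is an assumption on my part since the question does not name another package. The synthetic dataset and the feature names x0..x4 are invented for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only so the example is self-contained.
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3, random_state=123
)
feature_names = [f"x{i}" for i in range(X.shape[1])]  # hypothetical names

# The attribute only exists after the model has been fitted.
model = RandomForestClassifier(n_estimators=50, max_depth=10, random_state=123)
model.fit(X, y)

# Pair each feature name with its importance and sort, highest first.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

Note that the attribute is read from the fitted model object (model.feature_importances_), not used as a bare name.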
