Question

Hi all!

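Can anyone give me some advice on random forest implementations in Python? Ideally I need something that outputs as much information as possible about the classifier, in particular: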

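(1) which vectors from the training set are used to train each decision tree
(2) which features are selected at random at each node of each tree, which samples from the training set end up in that node, which feature is chosen for the split, and which threshold is used for the split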

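I have found quite a few implementations, the best known probably being the one in scikit-learn, but it is not clear how to do (1) and (2) there (see this question). Other implementations seem to have the same problem, except the one in OpenCV, but that one is in C++ (the Python interface does not cover all of the random forest methods).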

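Does anybody know of something that satisfies (1) and (2)? Alternatively, any ideas on how to improve the scikit-learn implementation to get features (1) and (2)?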

Solved: I checked the source code of sklearn.tree._tree.Tree. It has good comments that fully describe the tree:

children_left : int*
    children_left[i] holds the node id of the left child of node i.
    For leaves, children_left[i] == TREE_LEAF. Otherwise,
    children_left[i] > i. This child handles the case where
    X[:, feature[i]] <= threshold[i].

children_right : int*
    children_right[i] holds the node id of the right child of node i.
    For leaves, children_right[i] == TREE_LEAF. Otherwise,
    children_right[i] > i. This child handles the case where
    X[:, feature[i]] > threshold[i].

feature : int*
    feature[i] holds the feature to split on, for the internal node i.

threshold : double*
    threshold[i] holds the threshold for the internal node i.
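The four arrays described above are enough to route a sample from the root to a leaf. Here is a minimal sketch of that walk in plain Python. The five-node tree is written out by hand purely for illustration (it is not taken from a fitted model), find_path is a made-up helper name, and TREE_LEAF is assumed to be -1, which is the sentinel value scikit-learn uses.

```python
TREE_LEAF = -1  # assumed sentinel for "no child", as in scikit-learn

# Hypothetical 5-node tree, written by hand for illustration only.
# Node 0 splits on feature 0, node 2 splits on feature 1;
# nodes 1, 3 and 4 are leaves.
children_left = [1, TREE_LEAF, 3, TREE_LEAF, TREE_LEAF]
children_right = [2, TREE_LEAF, 4, TREE_LEAF, TREE_LEAF]
feature = [0, -2, 1, -2, -2]              # -2 is a placeholder for leaves
threshold = [0.5, -2.0, 1.5, -2.0, -2.0]  # likewise unused for leaves


def find_path(x):
    """Return the list of node ids visited by sample x, root first."""
    node = 0
    path = [node]
    while children_left[node] != TREE_LEAF:
        # Left child handles x[feature] <= threshold, right child the rest.
        if x[feature[node]] <= threshold[node]:
            node = children_left[node]
        else:
            node = children_right[node]
        path.append(node)
    return path


print(find_path([0.2, 9.9]))  # goes left at the root
print(find_path([0.9, 1.0]))  # goes right, then left
print(find_path([0.9, 2.0]))  # goes right, then right
```

Running every training vector through such a walk tells you which samples end up in which node, which covers most of (2).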

Answer 1

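You can get nearly all of that information in scikit-learn. What exactly was the problem? You can even visualize the trees using dot. I don't think you can find out which split candidates were sampled at random, but you can find out which ones were finally selected.

Edit: Look at the tree_ attribute of the decision tree. I agree, it is not very well documented. There really should be an example that visualizes the leaf distributions etc. You can have a look at the visualization function to see how to get at the properties.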
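As a rough sketch of what reading tree_ looks like in practice, the snippet below fits a small forest and collects, for every node of one tree, the chosen split feature, the threshold, and the rows of X that pass through that node. The iris data, the tiny forest size, and the describe_tree helper are my own choices for illustration. Note one caveat: decision_path routes every row of X you give it, so the sample lists show where all training vectors would land, not only the ones that were in that tree's bootstrap sample.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=3, random_state=0).fit(X, y)


def describe_tree(estimator, X):
    """Hypothetical helper: per-node split info and sample routing."""
    tree = estimator.tree_
    # Sparse (n_samples, n_nodes) indicator of which nodes each row visits.
    paths = estimator.decision_path(X).tocsc()
    nodes = []
    for node_id in range(tree.node_count):
        is_leaf = tree.children_left[node_id] == -1  # -1 marks "no child"
        rows = paths[:, node_id].nonzero()[0]
        nodes.append({
            "node": node_id,
            "is_leaf": bool(is_leaf),
            # feature and threshold are only meaningful for internal nodes
            "feature": None if is_leaf else int(tree.feature[node_id]),
            "threshold": None if is_leaf else float(tree.threshold[node_id]),
            "samples": rows.tolist(),
        })
    return nodes


# forest.estimators_ is the list of fitted decision trees.
info = describe_tree(forest.estimators_[0], X)
for node in info[:3]:
    print(node["node"], node["feature"], node["threshold"],
          len(node["samples"]))
```

This gives the selected feature and threshold per node, but as said above, not the random candidates that were considered and rejected.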

Random forest implementation in Python
