Question

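I am trying to reimplement RPART so that I can build on it later. So far my version handles only regression (ANOVA) models. Everything looks quite clean except for one thing: how does RPART choose the best split among several predictors that give the same improvement?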

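For example, for the initial split I have three predictors that give identical results (same improvement, same split, perfect surrogates of each other): X310, X312, and X317. By default RPART picks X312, even though it is not the first of these predictors in column order. If I permute the columns, RPART picks either X312 or X317, but never X310.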

Here is the summary output when X312 is chosen:

Node number 1: 100 observations,    complexity param=0.7123717
  mean=0.5155042, MSE=0.08350028
  left son=2 (47 obs) right son=3 (53 obs)
  Primary splits:
      X312 < 0.03673   to the left,  improve=0.7123717, (0 missing)
      X317 < 0.0187715 to the left,  improve=0.7123717, (0 missing)
      X310 < 0.0440585 to the left,  improve=0.7123717, (0 missing)
      X318 < 0.0167545 to the left,  improve=0.7123435, (0 missing)
      X323 < 0.0101715 to the left,  improve=0.7092180, (0 missing)

And when X317 is chosen:

Node number 1: 100 observations,    complexity param=0.7123717
  mean=0.5155042, MSE=0.08350028
  left son=2 (47 obs) right son=3 (53 obs)
  Primary splits:
      X317 < 0.0187715 to the left,  improve=0.7123717, (0 missing)
      X312 < 0.03673   to the left,  improve=0.7123717, (0 missing)
      X310 < 0.0440585 to the left,  improve=0.7123717, (0 missing)
      X318 < 0.0167545 to the left,  improve=0.7123435, (0 missing)
      X323 < 0.0101715 to the left,  improve=0.7092180, (0 missing)

Again, everything else is identical. I looked through RPART's C code but could not find any additional check. Any ideas would be much appreciated.
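
To make the setup concrete, here is a minimal Python sketch of the kind of ANOVA split search described above. It is not the RPART C code. The function names `best_anova_split` and `choose_predictor` are hypothetical, and three choices are assumptions made for illustration: the improvement is scaled by the node's total sum of squares, the cutpoint is the midpoint between adjacent distinct values, and a strict `>` comparison is used when comparing predictors.

```python
from typing import Dict, List, Optional, Tuple


def best_anova_split(x: List[float], y: List[float]) -> Optional[Tuple[float, float]]:
    """Return (improve, cutpoint) for the best split of y on x, or None.

    Assumption: improve = between-group sum of squares / total sum of squares.
    """
    n = len(y)
    total = sum(y)
    mean = total / n
    ss_total = sum((v - mean) ** 2 for v in y)
    if ss_total == 0.0:
        return None  # nothing left to explain in this node

    order = sorted(range(n), key=lambda i: x[i])
    best_gain, best_cut = 0.0, None
    left_sum = 0.0
    for k in range(n - 1):
        i, j = order[k], order[k + 1]
        left_sum += y[i]
        if x[i] == x[j]:
            continue  # cannot cut between equal x values
        n_left, n_right = k + 1, n - (k + 1)
        right_sum = total - left_sum
        # Reduction in sum of squares achieved by splitting here.
        gain = (n_left * (left_sum / n_left - mean) ** 2
                + n_right * (right_sum / n_right - mean) ** 2)
        if gain > best_gain:
            # Assumption: the cutpoint is the midpoint of the two neighbours.
            best_gain, best_cut = gain, (x[i] + x[j]) / 2
    if best_cut is None:
        return None
    return best_gain / ss_total, best_cut


def choose_predictor(columns: Dict[str, List[float]], y: List[float]) -> Optional[str]:
    """Pick the predictor with the largest improve; strict '>' keeps the first on exact ties."""
    best_name, best_improve = None, 0.0
    for name, x in columns.items():
        result = best_anova_split(x, y)
        if result is not None and result[0] > best_improve:
            best_name, best_improve = name, result[0]
    return best_name
```

Under this sketch, predictors whose improvements are exactly equal are always resolved in favour of the first column in iteration order, so it does not reproduce the behaviour shown above. One thing that may be worth checking, though it is only a guess and not confirmed from the RPART source, is whether the three improvement values are bit-for-bit equal or merely agree to the seven digits that the summary prints.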

Choosing among predictors with the same improvement

0 answers: