Prevent printcp from printing?

Time: 2014-07-31 16:53:55

Tags: r

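I have some code where I fit a tree and then prune it automatically, choosing the complexity parameter that minimizes the cross-validation error as reported by the printcp() function. When I read through my console output, I am annoyed by the mass of text that printcp() prints.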

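What I do is convert the output of printcp() to a data frame and then use some logic to extract the CP value with the lowest CV error. Is there any way I can do this without the output of printcp being printed to the console?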

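  # grow a deliberately large tree by setting a very small cp
  df_tree_1 <- rpart(formula(df_lm_2), cp = 0.0001, data = train)
  # printcp() returns the CP table, but it also prints it to the console
  cp_df <- data.frame(printcp(df_tree_1))
  # prune at the CP value with the lowest cross-validated error
  df_tree_1 <- prune.rpart(tree = df_tree_1, cp = cp_df$CP[which(cp_df$xerror == min(cp_df$xerror))])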

1 answer:

Answer 0 (score: 1)

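The tree object fitted by rpart() contains a table called "cptable" that holds the values you are looking for. The printcp() function merely displays this table, so all you really need to do is read the value from the table directly when you call prune(). Here is how you can do that: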

library(rpart)  # for the rpart function
library(rattle) # for "weather" dataset and for "fancy" tree plotter

# fit model using rpart
fit <- rpart(RainTomorrow ~ Rainfall + Evaporation + Sunshine + WindGustDir, 
             data = weather,
             method = "class")

# visualize with rattle
fancyRpartPlot(fit)

# prune by returning the value in the column of fit$cptable (a table)
# corresponding to the row that has the minimum "xerror" value
fit_autoprune <- prune(tree = fit,
                       cp = fit$cptable[which.min(fit$cptable[, "xerror"]),
                                        "CP"])

# visualize again to see difference
fancyRpartPlot(fit_autoprune)
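
For a version that runs without rattle, here is a minimal sketch of the same idea on the kyphosis data set that ships with rpart. The data set, the seed, and the variable names (cp_table, best_row, best_cp, fit_pruned, cp_from_print) are assumptions chosen for illustration and are not part of the original question. The last step is an alternative for the case where the printcp() call has to stay: it assumes printcp() returns the CP table invisibly, which is what the data.frame(printcp(...)) call in the question relies on.

```r
library(rpart)  # provides rpart(), prune(), printcp() and the kyphosis data

set.seed(42)  # cross-validation folds are random; fix the seed for repeatability

# grow a deliberately large tree, using the small cp from the question
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis,
             method = "class",
             cp = 0.0001)

# fit$cptable is a plain numeric matrix; reading it prints nothing
cp_table <- fit$cptable

# row with the smallest cross-validated error
# (which.min() returns the first such row if there are ties)
best_row <- which.min(cp_table[, "xerror"])
best_cp  <- cp_table[best_row, "CP"]

# prune at that complexity parameter
fit_pruned <- prune(fit, cp = best_cp)

# alternative: keep the printcp() call but swallow its console output
silenced <- capture.output(cp_from_print <- printcp(fit))
```

Reading fit$cptable directly is the simpler of the two options, since it avoids both the printing and the round trip through a data frame.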