rpart - a mysterious rpart parameter? No split at depth 1 (yet 2 splits are made at depth 2!!)

Time: 2016-10-15 15:08:30

Tags: r rpart

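I am running a simple test of how to build regression trees with rpart, and with some hand-made data I found a surprising behavior in R:
- when growing a tree with maxdepth = 1 -> no split at all!
- when growing a tree with maxdepth = 2 -> 2 splits are made!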

Why is no split made with maxdepth = 1? I suspect that some parameter of the rpart function is "blocking growth", but which one?

Here are the data and code used:

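# packages used below: dplyr for %>% and mutate(), rpart for the tree
library(dplyr)
library(rpart)

# generate some data
set.seed(1234)
x <- runif(200, min = 0, max = 1)
y <- runif(200, min = 0, max = 1)
mydf <- cbind.data.frame(x, y)
# target = 1 inside two vertical bands of x, restricted to 0.1 < y < 0.8
mydf <- mydf %>% mutate(target = ifelse(
  ((x > 0.2) & (x < 0.5) | (x > 0.7) & (x < 0.9)) & (y > 0.1) & (y < 0.8), 1, 0))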

# to look at data that was generated
plot(mydf$x, mydf$y, 
     main = "Observations (red triangles stand for Target = 1)", #title
     col = mydf$target + 1, #colours defined by an integer (in that case 1 or 2)
     pch = 16 + mydf$target)
abline(v = c(0.2, 0.5, 0.7, 0.9), lty = 2, col = "grey") 
abline(h = c(0.1, 0.8), lty = 2, col = "grey") 

# grow a tree
mydf$target_factor <- as.factor(ifelse(mydf$target == 1, "success", "failure"))
predictors <- c("x", "y")
predictors <- paste(predictors,collapse = "+")
formula <- paste("target_factor",predictors,sep="~")
formula <- as.formula(formula)

myregressiontree <- rpart(formula, data = mydf, control = rpart.control(maxdepth = 1))
print(myregressiontree)
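
One parameter that may be worth checking is the complexity parameter `cp` of `rpart.control`, whose documented default is 0.01: a split that does not improve the overall fit by at least that factor is not kept. Whether `cp` is the cause here is an assumption, not something confirmed in this thread. The sketch below rebuilds the same data in base R (so dplyr is not needed) and compares the default fit with one where `cp = -1`, which switches the complexity check off. The names `fit_default` and `fit_forced` are made up for illustration.

```r
library(rpart)

# Same data as above, rebuilt in base R so the block runs on its own
set.seed(1234)
x <- runif(200, min = 0, max = 1)
y <- runif(200, min = 0, max = 1)
target <- ifelse(((x > 0.2 & x < 0.5) | (x > 0.7 & x < 0.9)) &
                   (y > 0.1 & y < 0.8), 1, 0)
mydf <- data.frame(
  x, y,
  target_factor = factor(ifelse(target == 1, "success", "failure"))
)

# Fit 1: maxdepth = 1 with the default complexity parameter (cp = 0.01)
fit_default <- rpart(target_factor ~ x + y, data = mydf,
                     control = rpart.control(maxdepth = 1))

# Fit 2 (assumption under test): same depth limit, but cp = -1 so that
# no split is discarded for improving the fit too little
fit_forced <- rpart(target_factor ~ x + y, data = mydf,
                    control = rpart.control(maxdepth = 1, cp = -1))

# Each row of $frame is one node: 1 row = root only, 3 rows = one split
nrow(fit_default$frame)
nrow(fit_forced$frame)
print(fit_forced)
```

If the second tree has three nodes while the first has only the root, that would point to `cp` as the parameter that discards the lone split. Running `printcp()` on the maxdepth = 2 tree shows the complexity table, which may help explain why the first split survives once a second level is allowed.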

0 answers:

No answers yet