How to force rpart to make exactly one split

Date: 2015-02-27 08:01:47

Tags: r decision-tree rpart

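Having run into a problem similar to this one, I am trying to force rpart to make exactly one split. Here is a toy example that reproduces my problem: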

require(rpart)

y <- factor(c(1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0))
x1 <- c(12,18,15,10,10,10,20,6,7,34,7,11,10,22,4,19,10,8,13,6,7,47,6,15,7,7,21,7,8,10,15)
x2 <- c(318,356,341,189,308,236,290,635,550,287,261,472,282,262,1153,435,402,182,415,544,251,281,378,498,142,566,152,560,284,213,326)

data <- data.frame(y=y,x1=x1,x2=x2)
tree <- rpart(y ~ .,
              data=data,
              control=rpart.control(maxdepth=1, # at most 1 split
                                    cp=0, # any positive improvement will do
                                    minsplit=1,
                                    minbucket=1, # even leaves with 1 point are accepted
                                    xval=0)) # I don't need crossvalidation
length(tree$frame$var) # == 1, so the tree is only the root node and there are no splits

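It should be possible to isolate a single point (minbucket = 1), and even the most marginal improvement (isolating a single point always lowers the misclassification rate) should be enough to keep the split (cp = 0).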

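Why does the result not contain any split? How can I change the code so that I always get one split? If both children classify to the same factor level, can the split still be kept?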

1 Answer:

Answer 0: (score: 2)

Change cp = 0 to cp = -1.
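As a minimal sketch of that change, the question's call can be rerun with only the cp value altered. The data are copied from the question (the 25 zeros in y are written with rep() for brevity), and the name n_splits is introduced here just for illustration.

```r
library(rpart)

# Toy data from the question: 6 observations of class 1, then 25 of class 0
y  <- factor(c(rep(1, 6), rep(0, 25)))
x1 <- c(12,18,15,10,10,10,20,6,7,34,7,11,10,22,4,19,10,8,13,6,7,47,6,15,7,7,21,7,8,10,15)
x2 <- c(318,356,341,189,308,236,290,635,550,287,261,472,282,262,1153,435,402,182,415,544,251,281,378,498,142,566,152,560,284,213,326)
data <- data.frame(y = y, x1 = x1, x2 = x2)

# Same control settings as in the question, except cp = -1 instead of cp = 0
tree <- rpart(y ~ .,
              data = data,
              control = rpart.control(maxdepth = 1,
                                      cp = -1,
                                      minsplit = 1,
                                      minbucket = 1,
                                      xval = 0))

# rpart marks terminal nodes as "<leaf>" in tree$frame$var,
# so every other row of the frame is an internal node, i.e. a split
n_splits <- sum(tree$frame$var != "<leaf>")
print(n_splits)
print(tree)
```

Counting the rows whose var is not "<leaf>" is a more direct check than length(tree$frame$var), which counts leaves as well as splits.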

Apparently the cp of the first split (with maxdepth = 3) is 0.0000000, so going to a negative value allows it to show up with maxdepth = 1.
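One way to look at those cp values yourself is to grow a deeper tree and print its complexity table. The sketch below assumes cp = -1 is also used for the maxdepth = 3 run, which the answer does not state, and the name tree3 is hypothetical.

```r
library(rpart)

# Toy data from the question: 6 observations of class 1, then 25 of class 0
y  <- factor(c(rep(1, 6), rep(0, 25)))
x1 <- c(12,18,15,10,10,10,20,6,7,34,7,11,10,22,4,19,10,8,13,6,7,47,6,15,7,7,21,7,8,10,15)
x2 <- c(318,356,341,189,308,236,290,635,550,287,261,472,282,262,1153,435,402,182,415,544,251,281,378,498,142,566,152,560,284,213,326)
data <- data.frame(y = y, x1 = x1, x2 = x2)

# Assumption: cp = -1 here too, so that no split is dropped while growing
tree3 <- rpart(y ~ .,
               data = data,
               control = rpart.control(maxdepth = 3,
                                       cp = -1,
                                       minsplit = 1,
                                       minbucket = 1,
                                       xval = 0))

# The complexity table lists, per subtree size, the CP value,
# the number of splits and the relative error
print(tree3$cptable)
```

For a classification tree, rpart's complexity measure is based on the misclassification error. A first split whose two children both predict the majority class leaves that error unchanged, which is consistent with the cp of 0 reported in the answer and with cp = 0 not being enough to keep the split.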