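Reshaping binomial data into long Bernoulli format

Time: 2018-04-24 15:44:36

Tags: r dataframe reshape rpart

I'm coming back to R after a year away and want to use rpart to build a classification tree.

My data looks like this: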

Category, Shape, Color, Yes, No
A, Square, Blue, 3, 2
B, Triangle, Blue, 2, 4
etc. 

Any suggestions for reshaping it into the form below so that I can use rpart? (I believe rpart needs the data in this shape.)

ID, Shape, Color, Result
A, Square, Blue, Yes
A, Square, Blue, Yes
A, Square, Blue, Yes
A, Square, Blue, No
A, Square, Blue, No
B, Triangle, Green, Yes
etc...

Thanks!

2 answers:

Answer 0: (score: 2)

You can use melt from reshape2, then repeat the rows with rep:

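library(reshape2)
# Long format: one row per Category/Shape/Color and outcome, with the count in `value`
s <- melt(df, id.vars = c('Category', 'Shape', 'Color'))
s[rep(seq_len(nrow(s)), s$value), ]  # repeat each row `value` times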
              Category     Shape Color variable value
1                    A    Square  Blue      Yes     3
1.1                  A    Square  Blue      Yes     3
1.2                  A    Square  Blue      Yes     3
2                    B  Triangle  Blue      Yes     2
2.1                  B  Triangle  Blue      Yes     2
3                    A    Square  Blue       No     2
3.1                  A    Square  Blue       No     2
4                    B  Triangle  Blue       No     4
4.1                  B  Triangle  Blue       No     4
4.2                  B  Triangle  Blue       No     4
4.3                  B  Triangle  Blue       No     4
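
If you would rather not load a package, the same repeat-by-count idea can be sketched in base R. This is only an illustration: the data frame retypes the two rows shown in the question, and the names `idx`, `result`, and `long` are made up for the example.

```r
# Sample data: the two rows shown in the question
df <- data.frame(
  Category = c("A", "B"),
  Shape    = c("Square", "Triangle"),
  Color    = c("Blue", "Blue"),
  Yes      = c(3, 2),
  No       = c(2, 4)
)

# Repeat each row index once per trial (Yes + No trials per row)
idx <- rep(seq_len(nrow(df)), df$Yes + df$No)

# For each original row: "Yes" repeated Yes times, then "No" repeated No times
result <- unlist(Map(function(y, n) rep(c("Yes", "No"), c(y, n)), df$Yes, df$No))

# One row per trial, keeping only the predictors and the outcome
long <- data.frame(
  df[idx, c("Category", "Shape", "Color")],
  Result = factor(result, levels = c("No", "Yes")),
  row.names = NULL
)
```

Here the rows come out grouped by Category (all of A, then all of B), and the count columns are dropped so only the predictors and `Result` remain.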

Answer 1: (score: 1)

melt converts the data to long format; then repeat each variable label as many times as the count in the value column.

library(data.table)
melt(setDT(dat),1:3)[,rep(variable,value),by=.(Category,Shape,Color)]
            Category     Shape Color  V1
 1:                A    Square  Blue Yes
 2:                A    Square  Blue Yes
 3:                A    Square  Blue Yes
 4:                A    Square  Blue  No
 5:                A    Square  Blue  No
 6:                B  Triangle  Blue Yes
 7:                B  Triangle  Blue Yes
 8:                B  Triangle  Blue  No
 9:                B  Triangle  Blue  No
10:                B  Triangle  Blue  No
11:                B  Triangle  Blue  No
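
A small variation on the data.table call above, as a sketch: naming the expression in `j` gives the outcome column a proper name instead of the default `V1`. The column name `Result` is taken from the question's desired layout, and `dat` here is just the two sample rows retyped.

```r
library(data.table)

# Sample data: the two rows shown in the question
dat <- data.table(
  Category = c("A", "B"),
  Shape    = c("Square", "Triangle"),
  Color    = c("Blue", "Blue"),
  Yes      = c(3, 2),
  No       = c(2, 4)
)

# Melt to one row per group and outcome, with the count in `value`
counts <- melt(dat, id.vars = c("Category", "Shape", "Color"))

# Within each group, repeat the outcome label `value` times
# and call the resulting column Result
long <- counts[, .(Result = rep(variable, value)),
               by = .(Category, Shape, Color)]
```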

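Using tidyverse:

library(tidyverse)

dat %>%
  rowwise() %>%
  # one list element per row: "Yes" repeated Yes times, then "No" repeated No times
  mutate(var = list(rep(c("Yes", "No"), c(Yes, No)))) %>%
  select(-Yes, -No) %>%
  unnest(var)  # name the list column; tidyr 1.0.0 and later expect it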
 Category   Shape    Color var  
  <fct>    <fct>    <fct> <chr>
 1 A        Square   Blue  Yes  
 2 A        Square   Blue  Yes  
 3 A        Square   Blue  Yes  
 4 A        Square   Blue  No   
 5 A        Square   Blue  No   
 6 B        Triangle Blue  Yes  
 7 B        Triangle Blue  Yes  
 8 B        Triangle Blue  No   
 9 B        Triangle Blue  No   
10 B        Triangle Blue  No   
11 B        Triangle Blue  No
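
Another possible route, assuming a tidyr version that provides `pivot_longer()` and `uncount()` (1.0.0 or later): reshape the two count columns to long form, then let `uncount()` duplicate each row by its count. This is a sketch, not something tested against the full data; `dat` is the two sample rows retyped, and the column names `Result` and `n` are chosen for the example.

```r
library(tidyr)

# Sample data: the two rows shown in the question
dat <- data.frame(
  Category = c("A", "B"),
  Shape    = c("Square", "Triangle"),
  Color    = c("Blue", "Blue"),
  Yes      = c(3, 2),
  No       = c(2, 4)
)

# One row per Category/Shape/Color and outcome, with the count in `n`
counts <- pivot_longer(dat, cols = c(Yes, No),
                       names_to = "Result", values_to = "n")

# Duplicate each row `n` times; uncount() drops the `n` column by default
long <- uncount(counts, n)
```

Rows with a count of zero simply disappear, which is usually what you want for this kind of expansion.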