How to reshape this data frame in R

Time: 2017-02-06 07:44:48

Tags: r dataframe reshape

I have a data frame like the following:

      A    B    C   
[1,] "A1" "B3" "C1"
[2,] "A2" "B1" "C2"
[3,] "A3" "B3" "C3"
[4,] "A1" "B2" "C3"
[5,] "A3" "B3" "C2"
[6,] "A1" "B1" "C1"

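I want to reshape it so that each unique value of a variable is expanded into its own column, with 1/0 in the value field marking whether the row has that value. The data frame above should become: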

     A    B1    B2    B3    C1    C2    C3       
[1,] "A1" "0"   "0"   "1"   "1"   "0"   "0" 
[2,] "A2" "1"   "0"   "0"   "0"   "1"   "0" 
[3,] "A3" "0"   "0"   "1"   "0"   "0"   "1" 
[4,] "A1" "0"   "1"   "0"   "0"   "0"   "1" 
[5,] "A3" "0"   "0"   "1"   "0"   "1"   "0" 
[6,] "A1" "1"   "0"   "0"   "1"   "0"   "0"  

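The real data is large (more than 100,000 per day, with more fields and more unique values), so I need an efficient program rather than a for...

I'm sure you can help... I'm a beginner and only know... :(

2 Answers:

Answer 0: (score: 0)

You can also try this with base R: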

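# Assumes B and C are factors; contrasts() fails on character columns.
# contrasts = FALSE keeps one indicator column per level.
mm <- model.matrix(~ B + C + 0, df,
                   contrasts.arg = list(B = contrasts(df$B, contrasts = FALSE),
                                        C = contrasts(df$C, contrasts = FALSE)))
df <- cbind(as.character(df$A), mm)  # character matrix, as in the question
dimnames(df) <- list(NULL, c('A', paste0('B', 1:3), paste0('C', 1:3)))
df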
 #      A    B1  B2  B3  C1  C2  C3 
 #[1,] "A1" "0" "0" "1" "1" "0" "0"
 #[2,] "A2" "1" "0" "0" "0" "1" "0"
 #[3,] "A3" "0" "0" "1" "0" "0" "1"
 #[4,] "A1" "0" "1" "0" "0" "0" "1"
 #[5,] "A3" "0" "0" "1" "0" "1" "0"
 #[6,] "A1" "1" "0" "0" "1" "0" "0"
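
The column names above are hard-coded as B1..B3 and C1..C3, which gets awkward with more fields and more unique values. A minimal sketch of the same model.matrix idea that takes the names from the factor levels instead. It assumes the sample is stored as a data frame with B and C as factors; the names full_contrasts, mm and result are only illustrative:

```r
# Sample data from the question; storing B and C as factors is an assumption.
df <- data.frame(
  A = c("A1", "A2", "A3", "A1", "A3", "A1"),
  B = factor(c("B3", "B1", "B3", "B2", "B3", "B1")),
  C = factor(c("C1", "C2", "C3", "C3", "C2", "C1"))
)

# Full indicator coding (one column per level) for each factor column.
full_contrasts <- lapply(df[c("B", "C")], contrasts, contrasts = FALSE)
mm <- model.matrix(~ B + C + 0, data = df, contrasts.arg = full_contrasts)

# model.matrix() names columns <variable><level> (e.g. "BB1");
# replace them with the levels themselves.
colnames(mm) <- unlist(lapply(df[c("B", "C")], levels), use.names = FALSE)

# Keep A as-is and the indicators as numeric 0/1 columns.
result <- data.frame(A = df$A, mm, check.names = FALSE)
result
```

Unlike the cbind() version, this keeps the 0/1 values numeric rather than converting everything to character.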

Answer 1: (score: -1)

We can use mtabulate from the qdapTools package:

library(qdapTools)
cbind(df1[1],  mtabulate(as.data.frame(t(df1[-1]))))
#    A B3 C1 B1 C2 C3 B2
#V1 A1  1  1  0  0  0  0
#V2 A2  0  0  1  1  0  0
#V3 A3  1  0  0  0  1  0
#V4 A1  0  0  0  0  1  1
#V5 A3  1  0  0  1  0  0
#V6 A1  0  1  1  0  0  0
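
In the output above the indicator columns come out as B3 C1 B1 C2 C3 B2 rather than grouped by variable. If the grouped order from the question matters, a rough base R sketch without extra packages might look like the following. The helper one_hot is hypothetical (not part of any package), and the sample is assumed to be stored as a data frame of character columns:

```r
# Sample data from the question, stored as character columns (an assumption).
df1 <- data.frame(
  A = c("A1", "A2", "A3", "A1", "A3", "A1"),
  B = c("B3", "B1", "B3", "B2", "B3", "B1"),
  C = c("C1", "C2", "C3", "C3", "C2", "C1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one 0/1 column per sorted unique value of x.
one_hot <- function(x) {
  x <- as.character(x)
  lv <- sort(unique(x))
  m <- outer(x, lv, "==") + 0L  # logical comparison matrix -> integer 0/1
  colnames(m) <- lv
  m
}

# Encode every column except the first and bind the blocks side by side.
res <- cbind(df1[1], do.call(cbind, lapply(df1[-1], one_hot)))
res
```

The comparison is vectorized rather than looped over rows, but it builds a dense matrix for each column, so memory use grows with the number of rows times the number of unique values.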