Reshape data using unique values as columns

Date: 2017-04-21 09:25:20

Tags: r group-by reshape transpose

Generating dummy data

library(data.table)

# One row per (MainID, HouseholdID, relation); the A1 row is duplicated on purpose
MainID <- c('A1','A1','B2','C1','C1','C1','D2','D2')
HouseholdID <- c('Ab1','Ab1','cb2','Ca2','cb2','cb3','Da1','db2')
relation <- c('Spouse','Spouse','Child','Spouse','Child','Mother','Brother','Spouse')

df <- data.table(MainID, HouseholdID, relation)
head(df)

   MainID HouseholdID relation
1:     A1         Ab1   Spouse
2:     A1         Ab1   Spouse
3:     B2         cb2    Child
4:     C1         Ca2   Spouse
5:     C1         cb2    Child
6:     C1         cb3   Mother

I need to reshape this data so that each MainID has a single row, as shown below:

Expected result

MainID  Household1  Relation1  Household2  Relation2  Household3  Relation3
A1      Ab1         Spouse     NA          NA         NA          NA
B2      cb2         Child      NA          NA         NA          NA
C1      Ca2         Spouse     cb2         Child      cb3         Mother
D2      Da1         Brother    db2         Spouse     NA          NA

What is the best way to do this using dplyr, reshape, tidyverse, or any other method/package?

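1 Answer:

Answer 0: (score: 0)

Since you are already using "data.table", you can simply take the unique rows, add a row indicator variable that numbers the rows within each MainID, and finally dcast to wide format: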

library(data.table)
dcast(unique(df)[, ind := rowid(MainID)], 
      MainID ~ ind, value.var = c("HouseholdID", "relation"))
#    MainID HouseholdID_1 HouseholdID_2 HouseholdID_3 relation_1 relation_2 relation_3
# 1:     A1           Ab1            NA            NA     Spouse         NA         NA
# 2:     B2           cb2            NA            NA      Child         NA         NA
# 3:     C1           Ca2           cb2           cb3     Spouse      Child     Mother
# 4:     D2           Da1           db2            NA    Brother     Spouse         NA
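
For readers who want to see what each stage does, the one-liner above can be unpacked roughly as follows. This is only a sketch: the intermediate names `u`, `wide`, `n`, and `new_order` are made up for illustration, and the final `setcolorder()` step is an optional addition (not part of the answer) that interleaves the columns to match the layout requested in the question.

```r
library(data.table)

# Same dummy data as in the question
df <- data.table(
  MainID      = c('A1','A1','B2','C1','C1','C1','D2','D2'),
  HouseholdID = c('Ab1','Ab1','cb2','Ca2','cb2','cb3','Da1','db2'),
  relation    = c('Spouse','Spouse','Child','Spouse','Child','Mother','Brother','Spouse')
)

# Step 1: drop exact duplicate rows (the repeated A1 / Ab1 / Spouse row)
u <- unique(df)

# Step 2: number the rows within each MainID (1, 2, 3, ...)
u[, ind := rowid(MainID)]

# Step 3: cast to wide format, one column per (value column, ind) pair
wide <- dcast(u, MainID ~ ind, value.var = c("HouseholdID", "relation"))

# Optional (illustrative): interleave columns as HouseholdID_1, relation_1, HouseholdID_2, ...
n <- max(u$ind)
new_order <- c("MainID",
               as.vector(rbind(paste0("HouseholdID_", seq_len(n)),
                               paste0("relation_", seq_len(n)))))
setcolorder(wide, new_order)

print(wide)
```

Note that `unique()` returns a new table, so adding `ind` by reference in step 2 leaves the original `df` untouched.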