Casting from long to wide format, with each duplicate creating a new row

Time: 2016-10-27 20:23:57

Tags: r casting reshape melt

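I have a data frame in long format that I would like to convert to wide format. The data frame has several duplicated identifiers, which I want to treat as unique instances and represent as separate rows in the wide data frame.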

My question is similar to this one:

Forcing unique values before casting (pivoting) in R

In that question, however, the unique entries end up as separate columns. For my problem, I want the data placed in separate rows. For example:

ID1<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")

ID2<-c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R")

Sp<-c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird","Cat","Dog","Bird","Bird","Cat")

Count<-c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)

DF<-data.frame(ID1,ID2,Sp,Count)

After casting the data to wide format, I would like the output to look like this:

ID1    ID2    Bird  Cat  Dog
A      R       1     2    0
A      R       2     0    0 # 2 Birds in the A/ R combination so need second row (don't want to add them together)
A      L       1     0    2
B      R       1     0    1
B      R       0     0    2
B      L       0     3    0
B      L       0     2    0
C      R       1     2    0
C      R       0     5    0
C      L       2     0    3

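If there are no duplicates within a unique ID1/ID2 combination, the cast should work as normal. But if there are duplicates, a second (or third, or fourth) row should be created.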

1 answer:

Answer 0: (score: 1)

You can create an auxiliary ID column for each group of ID1, ID2 and Sp, and then reshape with ID1, ID2 and AUXID as the id columns:

library(dplyr)

DF = DF %>% group_by(ID1, ID2, Sp) %>% mutate(AUXID = row_number()) %>% as.data.frame()

reshape(DF, idvar = c("ID1", "ID2", "AUXID"), timevar = "Sp", direction = "wide")

#    ID1 ID2 AUXID Count.Bird Count.Cat Count.Dog
# 1    A   R     1          1         2        NA
# 3    A   R     2          2        NA        NA
# 4    A   L     1          1        NA         2
# 6    B   R     1          1        NA         1
# 7    B   R     2         NA        NA         2
# 8    B   L     1         NA         3        NA
# 9    B   L     2         NA         2        NA
# 11   C   R     1          1         2        NA
# 12   C   L     1          2        NA         3
# 15   C   R     2         NA         5        NA

You can then drop the AUXID column and fill in the NA values.
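The clean-up step mentioned in this answer is not spelled out, so here is a minimal sketch of one way it might be done. It assumes the NA cells should become 0 to match the desired output in the question, and it also strips the `Count.` prefix that `reshape()` adds; the name `wide` is only illustrative.

```r
library(dplyr)

# Example data from the question
DF <- data.frame(
  ID1 = rep(c("A", "B", "C"), each = 5),
  ID2 = c("R", "R", "R", "L", "L", "R", "R", "L", "L", "R",
          "R", "L", "L", "R", "R"),
  Sp = c("Bird", "Cat", "Bird", "Bird", "Dog", "Dog", "Dog", "Cat",
         "Cat", "Bird", "Cat", "Dog", "Bird", "Bird", "Cat"),
  Count = c(1, 2, 2, 1, 2, 1, 2, 3, 2, 1, 2, 3, 2, 1, 5)
)

# Number the repeats within each ID1/ID2/Sp group
DF <- DF %>%
  group_by(ID1, ID2, Sp) %>%
  mutate(AUXID = row_number()) %>%
  as.data.frame()

# Long to wide: one row per ID1/ID2/AUXID combination
wide <- reshape(DF, idvar = c("ID1", "ID2", "AUXID"),
                timevar = "Sp", direction = "wide")

# Clean-up (assumed): drop the helper column, replace NA with 0,
# and remove the "Count." prefix from the species columns
wide$AUXID <- NULL
wide[is.na(wide)] <- 0
names(wide) <- sub("^Count\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

The rows come out in order of first appearance in the long data, so the row order may differ slightly from the table in the question.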

A data.table version is also possible: dcast() provides a fill parameter for filling in the NA values.
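The data.table code from the original answer did not survive in this copy, so the following is only a sketch of how such a version might look, not the answerer's exact code. It assumes the same auxiliary-ID idea, with `fill = 0` standing in for the NA replacement; `DT` and `wide` are illustrative names.

```r
library(data.table)

# Example data from the question
DT <- data.table(
  ID1 = rep(c("A", "B", "C"), each = 5),
  ID2 = c("R", "R", "R", "L", "L", "R", "R", "L", "L", "R",
          "R", "L", "L", "R", "R"),
  Sp = c("Bird", "Cat", "Bird", "Bird", "Dog", "Dog", "Dog", "Cat",
         "Cat", "Bird", "Cat", "Dog", "Bird", "Bird", "Cat"),
  Count = c(1, 2, 2, 1, 2, 1, 2, 3, 2, 1, 2, 3, 2, 1, 5)
)

# Number the repeats within each ID1/ID2/Sp group
DT[, AUXID := seq_len(.N), by = .(ID1, ID2, Sp)]

# Cast to wide; fill = 0 replaces missing combinations instead of NA
wide <- dcast(DT, ID1 + ID2 + AUXID ~ Sp, value.var = "Count", fill = 0)

# Drop the helper column once it has done its job
wide[, AUXID := NULL]

print(wide)
```

Note that `dcast()` sorts the result by the id columns, so the rows will be ordered by ID1, ID2 and then AUXID rather than by first appearance.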