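I have a data frame in long format that I want to convert to wide format. The data frame has several repeated identifiers, which I want to treat as unique instances and represent as separate rows in the wide data frame.
My question is similar to this one:
Forcing unique values before casting (pivoting) in R
In that question, however, the unique entries end up as separate columns. For my problem, I want the data to go into separate rows. For example: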
ID1<-c("A","A","A","A","A","B","B","B","B","B","C","C","C","C","C")
ID2<-c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R")
Sp<-c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird","Cat","Dog","Bird","Bird","Cat")
Count<-c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)
DF<-data.frame(ID1,ID2,Sp,Count)
After converting the data to wide format, I would like the output to look like this:
ID1 ID2 Bird Cat Dog
A R 1 2 0
A R 2 0 0 # 2 Birds in the A/R combination, so a second row is needed (I don't want to add them together)
A L 1 0 2
B R 1 0 1
B R 0 0 2
B L 0 3 0
B L 0 2 0
C R 1 2 0
C R 0 5 0
C L 2 0 3
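If there are no duplicates within a unique ID1/ID2 combination, the cast works as normal. But if there are duplicates, I want a second (or third, or fourth) row to be created.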
Answer 0: (score: 1)
You can create an auxiliary ID column within each group of ID1, ID2 and Sp, and then reshape with ID1, ID2 and AUXID as the id columns:
library(dplyr)
# Number the rows within each ID1/ID2/Sp group so that repeats get distinct ids
DF <- DF %>%
  group_by(ID1, ID2, Sp) %>%
  mutate(AUXID = row_number()) %>%
  as.data.frame()
# Reshape to wide; each ID1/ID2/AUXID combination becomes one row
reshape(DF, idvar = c("ID1", "ID2", "AUXID"), timevar = "Sp", direction = "wide")
#    ID1 ID2 AUXID Count.Bird Count.Cat Count.Dog
# 1    A   R     1          1         2        NA
# 3    A   R     2          2        NA        NA
# 4    A   L     1          1        NA         2
# 6    B   R     1          1        NA         1
# 7    B   R     2         NA        NA         2
# 8    B   L     1         NA         3        NA
# 9    B   L     2         NA         2        NA
# 11   C   R     1          1         2        NA
# 12   C   L     1          2        NA         3
# 15   C   R     2         NA         5        NA
You can then drop the AUXID column and fill the NA values with 0.
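The clean-up step could look roughly like the sketch below. It is not part of the original answer: it rebuilds the example data, uses base R's ave() in place of the dplyr pipeline so that it runs without extra packages, and the object name wide and the stripping of the "Count." prefix are illustrative choices.

```r
# Example data from the question
DF <- data.frame(
  ID1   = rep(c("A", "B", "C"), each = 5),
  ID2   = c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R"),
  Sp    = c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird",
            "Cat","Dog","Bird","Bird","Cat"),
  Count = c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)
)

# Auxiliary id: running number within each ID1/ID2/Sp group
# (base R stand-in for group_by() + row_number())
DF$AUXID <- ave(DF$Count, DF$ID1, DF$ID2, DF$Sp, FUN = seq_along)

# "wide" is a hypothetical name for the reshaped result
wide <- reshape(DF, idvar = c("ID1", "ID2", "AUXID"),
                timevar = "Sp", direction = "wide")

wide$AUXID <- NULL                                 # drop the helper column
wide[is.na(wide)] <- 0                             # absent species count as 0
names(wide) <- sub("^Count\\.", "", names(wide))   # Count.Bird -> Bird, etc.
rownames(wide) <- NULL                             # reset the row labels
wide
```

The result has one row per ID1/ID2/AUXID combination and the columns ID1, ID2, Bird, Cat and Dog, as requested in the question.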
A data.table version is also possible with dcast(), which provides a fill parameter for filling the NA values.
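The code for the data.table version is missing from this page, so the following is only a minimal sketch of how it might look, not the answerer's original code. It assumes the data.table package is installed; the names DT and wide_dt are illustrative.

```r
library(data.table)

# Example data from the question, as a data.table
DT <- data.table(
  ID1   = rep(c("A", "B", "C"), each = 5),
  ID2   = c("R","R","R","L","L","R","R","L","L","R","R","L","L","R","R"),
  Sp    = c("Bird","Cat","Bird","Bird","Dog","Dog","Dog","Cat","Cat","Bird",
            "Cat","Dog","Bird","Bird","Cat"),
  Count = c(1,2,2,1,2,1,2,3,2,1,2,3,2,1,5)
)

# Auxiliary id: running number within each ID1/ID2/Sp group
DT[, AUXID := seq_len(.N), by = .(ID1, ID2, Sp)]

# Cast to wide; fill = 0 replaces the missing combinations instead of NA
wide_dt <- dcast(DT, ID1 + ID2 + AUXID ~ Sp, value.var = "Count", fill = 0)

# Drop the helper column once the cast is done
wide_dt[, AUXID := NULL]
wide_dt
```

Note that dcast() sorts the result by the id columns on the left-hand side of the formula, so the rows may come out in a different order than in the reshape() output above.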