Aggregating data (sum) with dcast from the reshape2 library in R
library('reshape2')
DF <- data.frame(Cohort=rep("", 9), Weeks=rep("", 9), myvalue=rep(0, 9), stringsAsFactors=FALSE )
DF[1, ] <- c("2012_30","0",0.02)
DF[2, ] <- c("2012_30","1",0.01)
DF[3, ] <- c("2012_30","2",0.1)
DF[4, ] <- c("2012_31","0",0.08)
DF[5, ] <- c("2012_31","1",0.0)
DF[6, ] <- c("2012_31","2",0.3)
DF[7, ] <- c("2012_32","0",0.26)
DF[8, ] <- c("2012_32","1",0.01)
DF[9, ] <- c("2012_32","2",0.01)
dcast(DF, Cohort ~ Weeks, value.var = "myvalue", fill='')
This produces:
Cohort 0 1 2
2012_30 0.02 0.01 0.10
2012_31 0.08 0.00 0.30
2012_32 0.26 0.01 0.01
...
I would like to get cumulative values like this:
Cohort 0 1 2
2012_30 0.02 0.03 0.13
2012_31 0.08 0.08 0.38
2012_32 0.26 0.27 0.28
...
I tried to find examples of fun.aggregate in the dcast documentation for reshape2 and in the original reshape paper.
Answer 0 (score: 2)
Try ddply as an intermediate step:
DF$myvalue <- as.numeric( DF$myvalue )
library( plyr )
DF <- ddply( DF, .(Cohort), transform, CUMSUM = cumsum( myvalue ) )
DF <- dcast( DF, Cohort ~ Weeks, value.var = "CUMSUM", fill='' )
DF
Cohort 0 1 2
1 2012_30 0.02 0.03 0.13
2 2012_31 0.08 0.08 0.38
3 2012_32 0.26 0.27 0.28
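If you would rather not pull in plyr, the same per-cohort running total can probably be computed with base R's ave() before casting. This is only a sketch under a few assumptions: the data frame is rebuilt here with numeric values from the start, the rows are sorted so that the total accumulates in week order, and the names CUMSUM and wide are illustrative choices rather than anything dcast requires.

```r
library(reshape2)

# Same values as in the question, but myvalue is numeric from the start
DF <- data.frame(
  Cohort  = rep(c("2012_30", "2012_31", "2012_32"), each = 3),
  Weeks   = rep(c("0", "1", "2"), times = 3),
  myvalue = c(0.02, 0.01, 0.10, 0.08, 0.00, 0.30, 0.26, 0.01, 0.01),
  stringsAsFactors = FALSE
)

# Sort so the running total follows week order within each cohort
DF <- DF[order(DF$Cohort, as.numeric(DF$Weeks)), ]

# ave() applies cumsum within each Cohort and returns a vector
# aligned with the rows of DF
DF$CUMSUM <- ave(DF$myvalue, DF$Cohort, FUN = cumsum)

# Cast the cumulative column to wide format; every Cohort/Weeks pair
# occurs exactly once, so no aggregation function is needed
wide <- dcast(DF, Cohort ~ Weeks, value.var = "CUMSUM")
wide
```

The cumulative step has to happen before the cast because fun.aggregate is expected to return a single value per cell, whereas cumsum returns one value per input row.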