I have this data:
thedat <- structure(list(id = c(" w12", " w12", " w12",
" w11", " w3", " w3", " w12",
" w45", " w24", " w24", " w24", " w08",
" w3", " w3", " w11"), time = structure(c(1559329080,
1559363580, 1559416140, 1559329380, 1559278020, 1559413920, 1559285100,
1559322660, 1559417460, 1559450220, 1559500980, 1559500980, 1559278020,
1559413920, 1559276700), class = c("POSIXct", "POSIXt"), tzone = ""),
x = c(28.03333, 18.45, 3.85, 27.95, 42.216667, 4.466667,
64.25, 53.81667, 27.483333, 18.383333, 4.283333, 4.28333336,
66.21667, 28.46667, 66.58333)), .Names = c("id", "time",
"x"), class = "data.frame", row.names = c(NA, -15L))
For each id, I want the cumulative sum of x. So for id = w11 I would get:
w11, 2019-05-31 05:25:00, 66.58333,
w11, 2019-05-31 20:03:00, 94.53333
I tried:
ddply(thedat, .(id), summarise,
time = unique(time),
answer = cumsum(x))
But this does not give me what I want. Any help is appreciated.
Answer 0 (score: 4)
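The problem is that the id strings contain varying numbers of spaces, so ids that look the same are treated as different groups. Remove the spaces, then sort by time so the running total follows chronological order: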
thedat$id <- gsub(" ", "", thedat$id)
thedat <- thedat[order(thedat$time),]
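To illustrate why the stray spaces matter, here is a minimal sketch using made-up ids (not the original data). Strings that differ only in whitespace count as distinct values, so they end up in separate groups. The sketch also uses base R's trimws() as a possible alternative to gsub(); it strips only leading and trailing whitespace, which gives the same result here because the hypothetical ids have no interior spaces.

```r
# Hypothetical ids that differ only in leading spaces (illustration only)
ids <- c(" w11", "  w11", "w11")

# Before cleaning: three distinct values, hence three separate groups
n_before <- length(unique(ids))

# After cleaning: a single value, hence one group
cleaned <- trimws(ids)
n_after <- length(unique(cleaned))

# For ids without interior spaces, trimws() and gsub(" ", "", ...) agree
same_as_gsub <- identical(cleaned, gsub(" ", "", ids))
```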
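When computing a cumulative sum, using transform instead of summarise seems more sensible, because transform keeps every row and adds the new column alongside the existing ones: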
library(plyr)
ddply(thedat, .(id), transform,
answer = cumsum(x))
id time x answer
1 w08 2019-06-02 20:43:00 4.283333 4.283333
2 w11 2019-05-31 06:25:00 66.583330 66.583330
3 w11 2019-05-31 21:03:00 27.950000 94.533330
4 w12 2019-05-31 08:45:00 64.250000 64.250000
5 w12 2019-05-31 20:58:00 28.033330 92.283330
6 w12 2019-06-01 06:33:00 18.450000 110.733330
7 w12 2019-06-01 21:09:00 3.850000 114.583330
8 w24 2019-06-01 21:31:00 27.483333 27.483333
9 w24 2019-06-02 06:37:00 18.383333 45.866666
10 w24 2019-06-02 20:43:00 4.283333 50.149999
11 w3 2019-05-31 06:47:00 42.216667 42.216667
12 w3 2019-05-31 06:47:00 66.216670 108.433337
13 w3 2019-06-01 20:32:00 4.466667 112.900004
14 w3 2019-06-01 20:32:00 28.466670 141.366674
15 w45 2019-05-31 19:11:00 53.816670 53.816670
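If loading plyr is not desirable, a similar per-group running total can probably be obtained in base R with ave(). This is a different technique from the ddply call above, sketched here on a small made-up data frame (the names df, id, and x are illustrative, not the original data). As with the approach above, the rows should be sorted by time first if the running total is meant to be chronological.

```r
# Small illustrative data frame (values are made up)
df <- data.frame(
  id = c("a", "b", "a", "b", "a"),
  x  = c(1, 10, 2, 20, 3)
)

# ave() applies FUN within each group defined by df$id and returns
# a vector aligned with the original row order
df$answer <- ave(df$x, df$id, FUN = cumsum)
```

Because ave() returns values in the original row order, the result can be assigned straight back as a new column without re-sorting or merging.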