我试图获取数据帧中组(“a”和“b”)的变量(v)的累积和。如何将结果在底部 - 其行的编号正确 - 得到我的数据帧的列cs?
> library(nlme)
> g <- factor(c("a","b","a","b","a","b","a","b","a","b","a","b"))
> v <- c(1,4,1,4,1,4,2,8,2,8,2,8)
> cs <- rep(0,12)
> d <- data.frame(g,v,cs)
> d
g v cs
1 a 1 0
2 b 4 0
3 a 1 0
4 b 4 0
5 a 1 0
6 b 4 0
7 a 2 0
8 b 8 0
9 a 2 0
10 b 8 0
11 a 2 0
12 b 8 0
> r=gapply(d,FUN="cumsum",form=~g, which="v")
>r
$a
v
1 1
3 2
5 3
7 5
9 7
11 9
$b
v
2 4
4 8
6 12
8 20
10 28
12 36
> str(r)
List of 2
$ a:'data.frame': 6 obs. of 1 variable:
..$ v: num [1:6] 1 2 3 5 7 9
$ b:'data.frame': 6 obs. of 1 variable:
..$ v: num [1:6] 4 8 12 20 28 36
我想我可以找出一些费力的方法将这些数据帧中的数据转换为d $ cs,但是我必须进行一些简单的调整。
答案 0 :(得分:13)
split<-
是一个非常奇怪的野兽
split(d$cs, d$g) <- lapply(split(d$v, d$g), cumsum)
导致
> d
g v cs
1 a 1 1
2 b 4 4
3 a 1 2
4 b 4 8
5 a 1 3
6 b 4 12
7 a 2 5
8 b 8 20
9 a 2 7
10 b 8 28
11 a 2 9
12 b 8 36
答案 1 :(得分:10)
我会使用ave
。如果你看一下ave
的来源,你会发现它基本上包裹了马丁摩根的solution。
R> g <- factor(c("a","b","a","b","a","b","a","b","a","b","a","b"))
R> v <- c(1,4,1,4,1,4,2,8,2,8,2,8)
R> d <- data.frame(g,v)
R> d$cs <- ave(v, g, FUN=cumsum)
R> d
g v cs
1 a 1 1
2 b 4 4
3 a 1 2
4 b 4 8
5 a 1 3
6 b 4 12
7 a 2 5
8 b 8 20
9 a 2 7
10 b 8 28
11 a 2 9
12 b 8 36
答案 2 :(得分:7)
我选择的工具是 plyr 包:
require(plyr)
> ddply(d,.(g),transform,cs = cumsum(v))
g v cs
1 a 1 1
2 a 1 2
3 a 1 3
4 a 2 5
5 a 2 7
6 a 2 9
7 b 4 4
8 b 4 8
9 b 4 12
10 b 8 20
11 b 8 28
12 b 8 36
答案 3 :(得分:0)
> library(nlme)
> g <- factor(c("a","b","a","b","a","b","a","b","a","b","a","b"))
> v <- c(1,4,1,4,1,4,2,8,2,8,2,8)
> cs <- rep(0,12)
> d <- data.frame(g,v,cs)
> d <- d[order(d$g),]
> temp <- by(d$v,d$g,cumsum)
> d$cs <- do.call("c",temp)
> d
g v cs
1 a 1 1
3 a 1 2
5 a 1 3
7 a 2 5
9 a 2 7
11 a 2 9
2 b 4 4
4 b 4 8
6 b 4 12
8 b 8 20
10 b 8 28
12 b 8 36
使用by函数的另一种解决方案,但我必须先订购数据