ddply and summing values in a data.frame as part of a function

Time: 2013-09-13 18:51:08

Tags: r plyr

I have a data.frame that looks similar to this:

Y       date    value1      value2
a 2013-01-01 28.857326   9.0206351
a 2013-01-02 13.675526   5.7823725
a 2013-01-03 20.115434   9.3267285
a 2013-01-04 -4.255547   0.9174301
a 2013-01-05 20.898522   9.7821027
b 2013-01-01  5.478783  27.0027194
b 2013-01-02 21.195939 -14.8786857
b 2013-01-03 -4.407236  18.9189197
b 2013-01-04 25.910805   1.0627444
b 2013-01-05 -2.511209  39.0908554

I want to calculate (value1 * value2) / sum(value1), where sum(value1) should be the sum over the rows of each date only. For example, the first row should be calculated as (28.857326 * 9.0206351) / (28.857326 + 5.478783).
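To make the target concrete, the first-row value can be reproduced by hand. This is only a sketch: the numbers are copied from the two 2013-01-01 rows of the table above, and the variable names (v1_a, v2_a, v1_b, date_total, first_row) are made up for illustration.

```r
# Values copied from the 2013-01-01 rows of the table above
v1_a <- 28.857326   # value1 of the row with y = "a"
v2_a <- 9.0206351   # value2 of the row with y = "a"
v1_b <- 5.478783    # value1 of the row with y = "b"

# The denominator is the sum of value1 over every row sharing that date
date_total <- v1_a + v1_b

# Desired result for the first row
first_row <- (v1_a * v2_a) / date_total
print(first_row)
```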

I have tried both of these:

ddply(x, .(date), summarize, freq=length(date), calc=(value1 * value2) / sum(value1))

ddply(x, .(date), summarize, calc=(value1 * value2) / sum(value1))

However, the first call gives me an error and the second gives me wrong results.

Here is the code to generate the dummy data:

a <- rnorm(10, 10, 10)
b <- rnorm(10, 10, 10)
x <- data.frame(y=c(rep("a", times=5), rep("b", times=5)), date=c(seq(as.Date("2013-01-01"), as.Date("2013-01-05"), by="days")), value1=a, value2=b)

2 Answers:

Answer 0: (score: 2)

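Your second line works and gives the "expected" result. The first one fails because, within each date group, length(date) returns a single value (here 2) while calc is a vector with one element per row (here length 2), and summarise cannot combine columns of different lengths into one data.frame. Since you want a result for every row of the data.frame, you should use transform instead of summarise: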

ddply(x, .(date), transform, freq=length(date), calc=(value1 * value2) / sum(value1))

   y       date     value1    value2 freq      calc
1  a 2013-01-01  8.0886946 -4.498656    2 -2.376917
2  b 2013-01-01  7.2203152  1.222322    2  0.576494
3  a 2013-01-02  7.9971361 -5.675020    2 -1.757606
4  b 2013-01-02 17.8242945 26.489059    2 18.285152
5  a 2013-01-03  3.0401349 10.495623    2  1.283746
6  b 2013-01-03 21.8153403 14.648083    2 12.856439
7  a 2013-01-04 14.4831518 -2.812941    2 -2.685447
8  b 2013-01-04  0.6875999 27.397730    2  1.241776
9  a 2013-01-05  6.2625381 19.979980    2  8.386698
10 b 2013-01-05  8.6569681 11.385124    2  6.606161
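The length mismatch and the per-row result can also be checked without plyr. The following is a minimal sketch in base R, not the answerer's method: the seed and the helper names (dates, grp, n_freq, n_calc) are assumptions for illustration, and ave() is swapped in for ddply to repeat the per-date sum on every row.

```r
# Rebuild dummy data in the same shape as the question (the seed is an assumption)
set.seed(1)
dates <- seq(as.Date("2013-01-01"), as.Date("2013-01-05"), by = "days")
x <- data.frame(y      = rep(c("a", "b"), each = 5),
                date   = rep(dates, times = 2),
                value1 = rnorm(10, 10, 10),
                value2 = rnorm(10, 10, 10))

# Look at one date group, roughly what ddply hands to summarise/transform
grp <- x[x$date == as.Date("2013-01-01"), ]
n_freq <- length(length(grp$date))                           # freq is a single value
n_calc <- length(grp$value1 * grp$value2 / sum(grp$value1))  # calc has one value per row

# Base R counterpart of the transform call: ave() returns the group
# statistic repeated for every row belonging to that date
x$freq <- ave(x$value1, x$date, FUN = length)
x$calc <- x$value1 * x$value2 / ave(x$value1, x$date, FUN = sum)
print(x)
```

Unlike ddply, this keeps the rows in their original order rather than sorting them by date.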

Answer 1: (score: 2)

Using data.table:

library(data.table)
x<-data.table(x)
x[,list(freq=length(date),cal=(value1*value2)/sum(value1)),keyby="date"]
          date freq         cal
 1: 2013-01-01    1 -3.94483543
 2: 2013-01-01    1 10.83779796
 3: 2013-01-02    1  2.33439622
 4: 2013-01-02    1 10.62941740
 5: 2013-01-03    1  2.97776304
 6: 2013-01-03    1  0.06035661
 7: 2013-01-04    1  1.59372587
 8: 2013-01-04    1  7.17029644
 9: 2013-01-05    1 -0.64156778
10: 2013-01-05    1 -1.23650898
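
Note that freq is 1 in the output above: inside j, a grouping column such as date has length one per group, so length(date) does not count the rows. The output also drops the y column. A possible variant, offered as a sketch rather than as the answerer's code, uses the special symbol .N for the group size and := to add the columns by reference; the seed and the names dates and dt are assumptions for illustration.

```r
library(data.table)

# Rebuild dummy data in the same shape as the question (the seed is an assumption)
set.seed(1)
dates <- seq(as.Date("2013-01-01"), as.Date("2013-01-05"), by = "days")
dt <- data.table(y      = rep(c("a", "b"), each = 5),
                 date   = rep(dates, times = 2),
                 value1 = rnorm(10, 10, 10),
                 value2 = rnorm(10, 10, 10))

# .N is the number of rows in the current group; := adds both columns
# in place, so y and the original row order are preserved
dt[, `:=`(freq = .N, calc = value1 * value2 / sum(value1)), by = date]
print(dt)
```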