Aggregating a data frame by variance

Time: 2017-06-26 18:06:33

Tags: r aggregate

Say I have this data frame, df:

       Day    value

1  2012-06-10   552
2  2012-06-10  4850
3  2012-06-11  4642
4  2012-06-11  4132
5  2012-06-11  4190
6  2012-06-12  4186
7  2012-06-13  1139
8  2012-06-13   490
9  2012-06-13  5156
10 2012-06-13  4430
11 2012-06-13  4447
12 2012-06-14  4256
13 2012-06-14  3856
14 2012-06-14  1163
15 2012-06-17   564
16 2012-06-17  4866
17 2012-06-17  4421
18 2012-06-19  4206
19 2012-06-20  4272
20 2012-06-20  3993
21 2012-06-20  1211
22 2012-07-21   698
23 2012-07-21  5770
24 2012-07-21  5103
25 2012-07-21   775
26 2012-07-21  5140
27 2012-07-22  4868

I want to create a data.frame, dfvar, containing the variance for each day, something like:

     Day       Variance

1  2012-06-10  9236402
2  2012-06-11   X
3  2012-06-12  4186
4  2012-06-13  1139
5  2012-06-14  4256
6  2012-06-17   564
7  2012-06-19  4206
8  2012-06-20  4272
9  2012-07-21   698
10 2012-07-22  4868

So, for example, I computed the first entry by hand: dfvar$Variance[1] = var(c(552, 4850))

I tried:

dfvar <- aggregate(df, by = list(Day), FUN = var)

But this does not give the output I expected. I want the variance of the values within each day only, not mixed with other days. Any ideas?

1 answer:

Answer 0: (score: 0)

Is this what you want?

library(dplyr)
df %>%
  group_by(Day) %>%
  dplyr::summarise(Variance = var(value))  # returns NA for a group with only one value

          Day   Variance
       <fctr>      <dbl>
1  2012-06-10 9236402.00
2  2012-06-11   77961.33
3  2012-06-12         NA
4  2012-06-13 4615704.30
5  2012-06-14 2829816.33
6  2012-06-17 5596946.33
7  2012-06-19         NA
8  2012-06-20 2864514.33
9  2012-07-21 6422224.70
10 2012-07-22         NA
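
If you would rather stay in base R, the formula interface of aggregate should give the same result. The attempt in the question passes the whole data frame, so var is also applied to the Day column, and a bare Day is only found if such an object exists outside df. The sketch below rebuilds just the first six rows of the question's data so that it runs on its own; the column name Variance is set by hand to match the desired output.

```r
# First six rows of the question's data, rebuilt so the example is self-contained.
df <- data.frame(
  Day   = c("2012-06-10", "2012-06-10", "2012-06-11",
            "2012-06-11", "2012-06-11", "2012-06-12"),
  value = c(552, 4850, 4642, 4132, 4190, 4186)
)

# Formula interface: compute var() of `value` separately for each `Day`.
dfvar <- aggregate(value ~ Day, data = df, FUN = var)

# aggregate() keeps the name `value`; rename it to match the desired output.
names(dfvar)[2] <- "Variance"
dfvar
```

As with the dplyr version, a day with a single observation (2012-06-12 here) comes back as NA, because the sample variance is undefined for one value.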