Aggregate daily values into weekly averages for each ID

Time: 2018-10-17 14:04:28

Tags: r dplyr plyr xts lubridate

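I have a data frame with an id column, where each id has values over several separate runs of consecutive dates. I now want to create a column that holds the weekly mean of the daily data.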

df
id   date      value
 1   2018-1-12 3
 1   2018-1-13 4
 1   2018-1-14 5
 1   2018-1-15 5
 1   2018-1-16 3
 1   2018-1-17 5
 1   2018-1-18 5
 1   2018-1-19 5
 2   2017-1-14 8
 .
 .
 .
 12  2016-12-10 7

I would like my df to be

df
id   date      value  mean_week
 1   2018-1-12 3      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-13 4      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-14 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-15 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-16 3      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-17 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-18 5      mean(7 consecutive days starting 2018-1-12 and id=1)
 1   2018-1-19 5      NA(since there is no consecutive seven days)
 2   2017-1-14 8      mean(7 consecutive days starting 2017-1-14 and id=2)
 .
 .
 .
 12  2016-12-10 7    NA(since there is no consecutive seven days)

I have searched for a simple way to do this, but so far I have only managed it with a loop.

2 answers:

Answer 0: (score: 1)

Something like this, although I am not sure how you want the start of the week to be defined:

library(tidyverse)
library(lubridate)

df <- read.table(text = "id   date      value
1   2018-1-12 3
1   2018-1-13 4
1   2018-1-14 5
1   2018-1-16 3
1   2018-1-17 5", header = TRUE)

df %>%
  mutate(week = isoweek(date)) %>%
  group_by(week, id) %>%
  mutate(mean_week = mean(value, na.rm = TRUE))
# A tibble: 5 x 5
# Groups:   week, id [2]
     id date      value  week mean_week
  <int> <fct>     <int> <dbl>     <dbl>
1     1 2018-1-12     3    2.        4.
2     1 2018-1-13     4    2.        4.
3     1 2018-1-14     5    2.        4.
4     1 2018-1-16     3    3.        4.
5     1 2018-1-17     5    3.        4.
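
On the question of where the week starts: `isoweek()` follows ISO 8601, where weeks run Monday to Sunday, while lubridate's `week()` counts 7-day periods from January 1. The sketch below compares the two on the dates used above plus 2016-12-10 from the question. The name `week_info` is only illustrative and is not part of the original answer.

```r
library(lubridate)

# Dates from the answer above, plus 2016-12-10 from the question
dates <- as.Date(c("2018-01-12", "2018-01-13", "2018-01-14",
                   "2018-01-16", "2018-01-17", "2016-12-10"))

week_info <- data.frame(
  date    = dates,
  iso_day = as.integer(format(dates, "%u")),  # ISO weekday: 1 = Monday ... 7 = Sunday
  isoweek = isoweek(dates),                   # ISO 8601 week, Monday to Sunday
  week    = week(dates)                       # 7-day periods counted from January 1
)

print(week_info)
```

In 2018, January 1 is a Monday, so both functions agree on those dates. For 2016-12-10 they differ (ISO week 49 versus `week()` 50). Neither definition starts a week at the first date of each id, which is what the question asks for.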

Answer 1: (score: 0)

Aggregate the data grouped by week, but use mutate() so that every row receives the aggregated value.

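library(dplyr)
library(lubridate)

# Example data: 100 consecutive days starting 2018-01-02.
# The values are random, so the means will differ between runs.
df <- data.frame(date = as.Date("2018-01-01") + 1:100,
                 value = sample(1:10, size = 100, replace = TRUE))

# week() counts 7-day periods from January 1; mutate() keeps one row per day
df %>% mutate(week = week(date)) %>%
  group_by(week) %>%
  mutate(summary = paste(round(mean(value),1),"(",n()," consecutive days starting ",min(date),")"))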

which gives

date value  week                                           summary
<date> <int> <dbl>                                             <chr>
1  2018-01-02     3     1 4.7  ( 6  consecutive days starting  2018-01-02 )
2  2018-01-03     6     1 4.7  ( 6  consecutive days starting  2018-01-02 )
3  2018-01-04     1     1 4.7  ( 6  consecutive days starting  2018-01-02 )
4  2018-01-05     1     1 4.7  ( 6  consecutive days starting  2018-01-02 )
5  2018-01-06    10     1 4.7  ( 6  consecutive days starting  2018-01-02 )
6  2018-01-07     7     1 4.7  ( 6  consecutive days starting  2018-01-02 )
7  2018-01-08     2     2   4  ( 7  consecutive days starting  2018-01-08 )
8  2018-01-09     2     2   4  ( 7  consecutive days starting  2018-01-08 )
9  2018-01-10     5     2   4  ( 7  consecutive days starting  2018-01-08 )
10 2018-01-11     7     2   4  ( 7  consecutive days starting  2018-01-08 )
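
Both answers group by calendar week, whereas the question describes 7-day blocks counted from the first date of each id, with NA when a block has fewer than seven consecutive days. Below is a minimal sketch of one way to get that with dplyr. It assumes that the blocks do not overlap and that they restart after every gap in the dates. The helper columns `run` and `block` and the name `result` are illustrative and do not come from the question or the answers.

```r
library(dplyr)

# Rows for id 1 as shown in the question, plus the single visible row for id 2
df <- data.frame(
  id    = c(rep(1, 8), 2),
  date  = as.Date(c("2018-01-12", "2018-01-13", "2018-01-14", "2018-01-15",
                    "2018-01-16", "2018-01-17", "2018-01-18", "2018-01-19",
                    "2017-01-14")),
  value = c(3, 4, 5, 5, 3, 5, 5, 5, 8)
)

result <- df %>%
  arrange(id, date) %>%
  group_by(id) %>%
  # a new run starts whenever the gap to the previous date is not exactly one day
  mutate(run = cumsum(c(TRUE, as.integer(diff(date)) != 1L))) %>%
  group_by(id, run) %>%
  # number the 7-day blocks inside each run: rows 1-7 are block 1, rows 8-14 block 2, ...
  mutate(block = (row_number() - 1) %/% 7 + 1) %>%
  group_by(id, run, block) %>%
  # only complete blocks of seven days get a mean; shorter ones get NA
  mutate(mean_week = if (n() == 7) mean(value) else NA_real_) %>%
  ungroup() %>%
  select(-run, -block)

print(result)
```

With this data, the first seven rows of id 1 share one mean, and 2018-01-19 gets NA because its block holds a single day. The id 2 row is also NA here, but only because its remaining rows are elided in the question.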