How to create a new value based on date values in R

Date: 2018-01-08 09:01:03

Tags: r date

I have a large data frame like this:

df <- data.frame(id = c('1', '2', '3', '4', '5', '6'), Date = c("01-Feb-17", "05-Feb-17", "03-May-17","24-May-17","20-Oct-17", "25-Oct-17"), Name=c("John", "Jack", "Jack", "John", "John", "Jack"), Workout=c('150', '130', '140', '160', '150', '130'))

How can I create a new value (Average_Workout) that contains the average of "Workout" for each period since the start of the year?

For example:

(image: expected output)

1 Answer:

Answer 0 (score: 3)

We can use cummean after grouping by 'Name':

library(dplyr)
res <- df %>%
         # if the rows are not already ordered by 'Date', uncomment the next line
         #arrange(Name, as.Date(Date, "%d-%b-%y")) %>%
         group_by(Name) %>%
         mutate(Avg = cummean(Workout))

as.data.frame(res)
#  id      Date Name Workout      Avg
#1  1 01-Feb-17 John     150 150.0000
#2  2 05-Feb-17 Jack     130 130.0000
#3  3 03-May-17 Jack     140 135.0000
#4  4 24-May-17 John     160 155.0000
#5  5 20-Oct-17 John     150 153.3333
#6  6 25-Oct-17 Jack     130 133.3333
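
For readers who do not have dplyr installed, the same running mean can be sketched in base R. This is only an illustration, not part of the original answer: it assumes Workout is numeric and the rows are already in date order, and the helper name running_mean is made up here.

```r
# Data as in the answer, with Workout stored as numbers
df <- data.frame(id = c('1', '2', '3', '4', '5', '6'),
  Date = c("01-Feb-17", "05-Feb-17", "03-May-17", "24-May-17", "20-Oct-17", "25-Oct-17"),
  Name = c("John", "Jack", "Jack", "John", "John", "Jack"),
  Workout = c(150, 130, 140, 160, 150, 130), stringsAsFactors = FALSE)

# Hypothetical helper: cumulative sum divided by the number of values seen so far
running_mean <- function(x) cumsum(x) / seq_along(x)

# ave() applies the helper within each Name and keeps the original row order
df$Avg <- ave(df$Workout, df$Name, FUN = running_mean)
df
```

The Avg column should match the cummean result shown above, since both divide the cumulative sum within each Name by the running count.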

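Note: when we quote the numeric elements, the column will be of class character or factor, depending on whether stringsAsFactors is FALSE or TRUE. cummean needs a numeric column, so Workout is entered without quotes in the data below.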
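
If the data frame already exists with a quoted Workout column, as in the question, one possible fix is to convert the column rather than rebuild the data. The sketch below uses only base R; the variable names are chosen for illustration.

```r
# Quoted values, as in the question: a character vector, or a factor
workout_chr <- c('150', '130', '140', '160', '150', '130')
workout_fct <- factor(workout_chr)

# Going through as.character() first converts a factor by its labels
from_chr <- as.numeric(as.character(workout_chr))
from_fct <- as.numeric(as.character(workout_fct))

# as.numeric() alone on a factor returns the internal level codes, not the values
codes <- as.numeric(workout_fct)

from_chr
from_fct
codes
```

Applying as.numeric(as.character(df$Workout)) to the question's data frame would work in either case, which is why the two-step conversion is used here.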

Data

df <- data.frame(id = c('1', '2', '3', '4', '5', '6'), 
  Date = c("01-Feb-17", "05-Feb-17", "03-May-17","24-May-17","20-Oct-17", "25-Oct-17"), 
  Name=c("John", "Jack", "Jack", "John", "John", "Jack"), 
  Workout=c(150, 130, 140, 160, 150, 130), stringsAsFactors = FALSE)
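
The question asks for the average since the start of the year. The sample data only covers 2017, so grouping by Name is enough there. If the data spanned several years, one possible reading is that the running mean should restart each year. The base R sketch below works under that assumption; the columns Date2 and Year are introduced only for illustration, and the locale call is there because %b matches month abbreviations in the current locale.

```r
# Use the "C" locale so that "Feb", "May", "Oct" are recognised by %b
invisible(Sys.setlocale("LC_TIME", "C"))

df <- data.frame(id = c('1', '2', '3', '4', '5', '6'),
  Date = c("01-Feb-17", "05-Feb-17", "03-May-17", "24-May-17", "20-Oct-17", "25-Oct-17"),
  Name = c("John", "Jack", "Jack", "John", "John", "Jack"),
  Workout = c(150, 130, 140, 160, 150, 130), stringsAsFactors = FALSE)

# Parse the text dates, then sort by person and date
df$Date2 <- as.Date(df$Date, format = "%d-%b-%y")
df <- df[order(df$Name, df$Date2), ]

# Restart the running mean for each person and calendar year
df$Year <- format(df$Date2, "%Y")
df$Avg <- ave(df$Workout, df$Name, df$Year,
              FUN = function(x) cumsum(x) / seq_along(x))
df
```

With the single-year sample this gives the same averages as grouping by Name alone, only with the rows sorted by person.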