R data frame: split by factor, then apply and tidyr

Time: 2017-03-21 04:58:55

Tags: r dataframe split apply tidyr

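There is an 8-week internet experiment. Data is collected for every participant, and participants can start the experiment on any date. The idea is to count the exercises each participant did in their first week, their second week, and so on. So the result should be a participants × 8 matrix/data frame.

  • Each participant can start on any date, but the experiment closes for them after 8 weeks
  • Each participant can do as many exercises as they want.

Here is an example: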



df <- data.frame(
        fac=c("a","a","a","a","a","b","b","b","b","b","c","c","c","c","c","d","d","d","d","d","d"), 
        date=c("2017-01-01","2017-01-05","2017-01-13","2017-01-25","2017-02-10","2017-01-06","2017-01-16","2017-01-28","2017-02-02","2017-02-07","2017-01-11","2017-01-19","2017-01-24","2017-01-31","2017-02-09","2017-01-12","2017-01-24","2017-01-29","2017-02-04","2017-02-19","2017-03-08"), 
        sessions=c(1,2,3,6,5,1,3,2,3,3,1,5,3,2,4,1,3,5,2,6,6)
        )

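My idea:

  1. Add a "0" column (df$count <- 0)
  2. Split the data frame by factor [split(df, df$fac)]
  3. Take each date value, subtract the date of the participant's first entry, add 1, divide by 7, and round up [ceiling((date - date[1] + 1) / 7)]. This gives exactly the week number in which the participant did the exercises.
  4. With tidyr: reshape the whole data frame so that the values for each week are summed (a participants × 8 data frame)
  5. But I don't know how to implement step 3 correctly and combine it with step 4

    Thanks a lot!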

1 answer:

Answer 0: (score: 0)

Something like this:

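library(dplyr)
df <- df %>% 
    group_by(fac) %>% 
    # week number relative to each participant's first row (days 0-6 -> week 1);
    # assumes rows are sorted by date within each participant
    mutate(time = ceiling((as.numeric(difftime(as.Date(date), as.Date(date[1]), units = 'days')) + 1) / 7))
# total sessions per participant and week
summarize(group_by(df, fac, time), total_sessions = sum(sessions))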
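
To get from a per-week summary to the participants × 8 layout asked for in the question, one option is tidyr's complete() followed by pivot_wider(). The following is only a minimal sketch, assuming dplyr >= 1.0 and tidyr >= 1.0. The names week, weekly and wide are made up for illustration, and it takes min(date) as each participant's start date instead of relying on row order. Dropping anything after week 8 is also an assumption about how late entries should be handled.

```r
library(dplyr)
library(tidyr)

# Example data from the question, with dates parsed up front
df <- data.frame(
  fac = rep(c("a", "b", "c", "d"), times = c(5, 5, 5, 6)),
  date = as.Date(c(
    "2017-01-01", "2017-01-05", "2017-01-13", "2017-01-25", "2017-02-10",
    "2017-01-06", "2017-01-16", "2017-01-28", "2017-02-02", "2017-02-07",
    "2017-01-11", "2017-01-19", "2017-01-24", "2017-01-31", "2017-02-09",
    "2017-01-12", "2017-01-24", "2017-01-29", "2017-02-04", "2017-02-19",
    "2017-03-08"
  )),
  sessions = c(1, 2, 3, 6, 5, 1, 3, 2, 3, 3, 1, 5, 3, 2, 4, 1, 3, 5, 2, 6, 6)
)

weekly <- df %>%
  group_by(fac) %>%
  # days since the participant's first date: 0-6 -> week 1, 7-13 -> week 2, ...
  # (same result as ceiling((days + 1) / 7) for whole days)
  mutate(week = as.integer(date - min(date)) %/% 7L + 1L) %>%
  ungroup() %>%
  # assumption: ignore anything recorded after a participant's week 8
  filter(week <= 8) %>%
  group_by(fac, week) %>%
  summarise(total_sessions = sum(sessions), .groups = "drop") %>%
  # add participant/week combinations with no exercises as explicit zeros
  complete(fac, week = 1:8, fill = list(total_sessions = 0))

# one row per participant, one column per week
wide <- weekly %>%
  pivot_wider(names_from = week, values_from = total_sessions,
              names_prefix = "week_")

print(wide)
```

With the example data, participant a ends up with 3, 3, 0, 6, 0, 5, 0, 0 sessions in weeks 1 to 8. The complete() step matters here because nobody has an entry in week 7, so that column would otherwise be missing from the wide result.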