Aggregating subsets in R that share the same category

Asked: 2015-07-31 11:43:04

Tags: r

My input is:

UserName    Date    Time    Module  ActiveTime
A   5/20/2015   10:00   E1  5
A   5/20/2015   10:01   E1  2
A   5/20/2015   10:02   O1  2
A   5/20/2015   10:05   Exp 4
A   5/20/2015   10:06   Exp 3
A   5/20/2015   10:06   O1  2
A   5/20/2015   10:06   Exp 5
A   5/20/2015   10:06   EXC 1
A   5/20/2015   10:06   EXC 2
A   5/20/2015   10:06   NOTE    1
B   5/20/2015   10:00   mstsc   3
B   5/20/2015   10:01   mstsc   4
B   5/20/2015   10:02   NOTE    1
B   5/20/2015   10:05   Exp 5
B   5/20/2015   10:06   Exp 1
B   5/20/2015   10:06   EXC 2
B   5/20/2015   10:06   Exp 5
B   5/20/2015   10:07   EXC 1
B   5/20/2015   10:08   EXC 2

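Now I want to sum ActiveTime for each Module, but only over consecutive rows, until the Module changes, not over the whole group. So the output should look like: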

UserName    Date    Time    Module  ActiveTime
A   5/20/2015   10:00   E1  7
A   5/20/2015   10:02   O1  2
A   5/20/2015   10:05   Exp 7
A   5/20/2015   10:06   O1  2
A   5/20/2015   10:06   Exp 5
A   5/20/2015   10:06   EXC 3
A   5/20/2015   10:06   NOTE    1
B   5/20/2015   10:00   mstsc   7
B   5/20/2015   10:02   NOTE    1
B   5/20/2015   10:05   Exp 6
B   5/20/2015   10:06   EXC 2
B   5/20/2015   10:06   Exp 5
B   5/20/2015   10:07   EXC 3

Any suggestions or ideas would be appreciated.

1 answer:

Answer 0 (score: 0)

Try this.

Assuming the data frame is named df:

install.packages("dplyr")     # run once if dplyr is not installed
install.packages("lubridate") # run once if lubridate is not installed

library(dplyr)
library(lubridate)

newDf <- df %>%
  # convert "HH:MM" to minutes since midnight
  mutate(Time = hour(hm(Time)) * 60 + minute(hm(Time))) %>%
  # note: this groups every row with the same UserName, Date and Module,
  # whether or not the rows are consecutive
  group_by(UserName, Date, Module) %>%
  summarise(Time = min(Time), ActiveTime = sum(ActiveTime)) %>%
  # convert minutes back to "H:MM"; %% is R's modulo operator (base R has no mod())
  mutate(Time = sprintf("%d:%02d", Time %/% 60, Time %% 60)) %>%
  ungroup() %>%
  select(UserName, Date, Time, Module, ActiveTime)

View(newDf)

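*Logic*

- Convert the time to minutes.

- Compute the sum of ActiveTime and the minimum Time (in minutes) for each combination of UserName, Date, and Module.

- Convert the minutes back to H:MM format.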
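
Grouping by UserName, Date, and Module alone sums every row that shares a Module, while the question asks for totals over consecutive rows only. One possible way to get that, sketched here in base R, is to give each run of identical rows its own id and aggregate by that id. This is not part of the original answer: the names key, run_id, and result are made up for illustration, and the sketch assumes the rows are already in chronological order for each user.

```r
# Sample input from the question, rebuilt as a data frame.
df <- data.frame(
  UserName = rep(c("A", "B"), times = c(10, 9)),
  Date = "5/20/2015",
  Time = c("10:00", "10:01", "10:02", "10:05", "10:06",
           "10:06", "10:06", "10:06", "10:06", "10:06",
           "10:00", "10:01", "10:02", "10:05", "10:06",
           "10:06", "10:06", "10:07", "10:08"),
  Module = c("E1", "E1", "O1", "Exp", "Exp", "O1", "Exp", "EXC", "EXC", "NOTE",
             "mstsc", "mstsc", "NOTE", "Exp", "Exp", "EXC", "Exp", "EXC", "EXC"),
  ActiveTime = c(5, 2, 2, 4, 3, 2, 5, 1, 2, 1,
                 3, 4, 1, 5, 1, 2, 5, 1, 2),
  stringsAsFactors = FALSE
)

# One key per row; a run ends whenever UserName, Date or Module changes.
# Assumption: rows are already sorted chronologically within each user.
key <- paste(df$UserName, df$Date, df$Module, sep = "|")

# rle() finds runs of identical consecutive keys; expand them to a per-row id.
runs <- rle(key)
run_id <- rep(seq_along(runs$lengths), times = runs$lengths)

# Keep the first row of each run (this carries the run's start Time),
# then replace ActiveTime with the total over the run.
result <- df[!duplicated(run_id), c("UserName", "Date", "Time", "Module")]
result$ActiveTime <- as.vector(tapply(df$ActiveTime, run_id, sum))
rownames(result) <- NULL

print(result)
```

Because the first row of each run is kept as-is, Time never has to be converted to minutes and back. With dplyr, the same run id could be added in a mutate() step and used in group_by() in place of Module alone.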