I have a data frame of user logs (my input):
user_id log_category client_ts
1 Rob user 1455035670
2 Fred progression 1455035345
3 Rob design 1455035547
4 Rob design 1455035870
5 Fred user 1455035970
6 Fred progression 1455035548
First, all I want is the last log timestamp (client_ts) for each user_id (output):
user_id client_ts
1 Rob 1455035870
2 Fred 1455035970
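Then I want to add two columns to my input: last_log_ts, the user's last log timestamp, and is_leave, a factor that is "yes" if last_log_ts < 1455035950 and "no" otherwise (output):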
user_id log_category client_ts last_log_ts is_leave
1 Rob user 1455035670 1455035870 yes
2 Fred progression 1455035345 1455035970 no
3 Rob design 1455035547 1455035870 yes
4 Rob design 1455035870 1455035870 yes
5 Fred user 1455035970 1455035970 no
6 Fred progression 1455035548 1455035970 no
Answer 0 (score: 2)
Using data.table, we can do this:
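library(data.table)
# Add each user's latest timestamp by reference, then flag users whose last log precedes the cutoff
setDT(df)[, last_log_ts := max(client_ts), by = user_id][, is_leave := ifelse(last_log_ts < 1455035950, "yes", "no")]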
> df
# user_id log_category client_ts last_log_ts is_leave
#1: Rob user 1455035670 1455035870 yes
#2: Fred progression 1455035345 1455035970 no
#3: Rob design 1455035547 1455035870 yes
#4: Rob design 1455035870 1455035870 yes
#5: Fred user 1455035970 1455035970 no
#6: Fred progression 1455035548 1455035970 no
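
The question also asks for a per-user summary (one row per user_id with the last client_ts) and for is_leave to be a factor, whereas ifelse() with string results returns a character vector. The following is a minimal sketch covering both points, assuming the sample data from the question. The name last_log and the factor level order are illustrative choices, not part of the original answer.

```r
library(data.table)

# Sample data copied from the question
df <- data.table(
  user_id      = c("Rob", "Fred", "Rob", "Rob", "Fred", "Fred"),
  log_category = c("user", "progression", "design", "design", "user", "progression"),
  client_ts    = c(1455035670, 1455035345, 1455035547, 1455035870, 1455035970, 1455035548)
)

# One row per user holding that user's latest timestamp
# (last_log is a hypothetical name chosen for this sketch)
last_log <- df[, .(client_ts = max(client_ts)), by = user_id]

# Same per-row columns as in the answer, with is_leave stored as a factor
# (the level order c("no", "yes") is an assumption)
df[, last_log_ts := max(client_ts), by = user_id]
df[, is_leave := factor(ifelse(last_log_ts < 1455035950, "yes", "no"),
                        levels = c("no", "yes"))]
```

Grouping with by = user_id returns groups in order of first appearance, so last_log lists Rob before Fred, as in the desired output.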