Question

I have a df that looks like this:

user_id   session_id     timestamp          
141.0      1.0   20190418 02:23:56.000 
141.0      2.0   20190416 19:51:57.000 
141.0      3.0   20190415 14:47:53.000   
121.0      4.0   20190414 13:57:55.000    
121.0      5.0   20190414 06:23:01.000  
121.0      6.0   20190412 15:32:57.000

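I am trying to apply a lambda function with a groupby that, for each user_id, counts the number of sessions in the 24 hours leading up to each session's timestamp: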

The result should be:


user_id   session_id     timestamp            24-HourCount  
141.0      1.0   20190418 02:23:56.000             0
141.0      2.0   20190416 19:51:57.000             0
141.0      3.0   20190415 14:47:53.000             na  
121.0      4.0   20190414 13:57:55.000             3
121.0      5.0   20190414 06:23:01.000             1
121.0      6.0   20190413 15:32:57.000             na

I tried grouping and counting the rows (all sessions are distinct values), but I get an error.

df['24-HourCount'] = df.groupby('user_id')['timestamp'].transform(lambda x:\
          x.between(x.max()- dt.timedelta(days=1),x.max())).count()))

I also tried applying a function:
def func(dfx):
    k=dfx[dfx.between(dfx[0]-dt.timedelta(days=1),dfx[0])].count()
    return(k)

df['24-HourCount']=df.groupby('user_id').apply(func)
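
For reference, here is a minimal sketch of one possible reading of the requirement. It assumes the count for a session means the number of other sessions by the same user_id whose timestamp falls in the window [t - 24h, t), where t is that session's timestamp. The helper array names (lower, upper) and the choice of np.searchsorted are illustrative assumptions, not something taken from the attempts above, and the data is the sample df shown at the top.

```python
import numpy as np
import pandas as pd

# Sample data copied from the first table in the question.
df = pd.DataFrame({
    "user_id": [141.0, 141.0, 141.0, 121.0, 121.0, 121.0],
    "session_id": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "timestamp": [
        "20190418 02:23:56.000",
        "20190416 19:51:57.000",
        "20190415 14:47:53.000",
        "20190414 13:57:55.000",
        "20190414 06:23:01.000",
        "20190412 15:32:57.000",
    ],
})

# Parse the strings into real datetimes so time arithmetic works.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y%m%d %H:%M:%S.%f")

counts = pd.Series(0, index=df.index, dtype="int64")

for _, ts in df.groupby("user_id")["timestamp"]:
    sorted_ts = ts.sort_values()
    values = sorted_ts.to_numpy()
    # Position of the first session at or after (t - 24h).
    lower = np.searchsorted(values, values - np.timedelta64(24, "h"), side="left")
    # Position of the first session at or after t, i.e. the number of
    # strictly earlier sessions for this user.
    upper = np.searchsorted(values, values, side="left")
    # Sessions in [t - 24h, t), written back in the original row order.
    counts.loc[sorted_ts.index] = upper - lower

df["24-HourCount"] = counts
print(df)
```

Under this assumed definition the sample rows get counts 0, 0, 0, 1, 0, 0 (only session 4.0 has another session of the same user in the preceding 24 hours), which does not match the expected table above, so the intended definition may be different.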

Thanks!

Group by and count values in the last 24 hours

0 Answers: