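I'm trying to come up with a way to calculate session duration. My sample data is below. I assume that when someone logs in again, they start a new session, so the previous session must have ended. The session duration is therefore the time from a login action to the last action before the same user logs in again.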
Action,Duration,_time,User
getForeignBugs,3,2016-11-07 15:45:18.992,savaithi
getServiceRequests,5,2016-11-07 15:45:18.902,savaithi
login,8088,2016-11-07 15:45:18.804,savaithi
getAuditTrail,550,2016-11-07 15:45:10.627,savaithi
getEnclosures,447,2016-11-07 15:45:09.994,savaithi
login,4810,2016-11-07 15:45:09.040,savaithi
getNoteTemplates,2,2016-11-07 15:45:04.220,savaithi
getQuickSearchInitInfo2,3,2016-11-07 15:45:01.995,savaithi
getQuickSearchInitInfo,3,2016-11-07 15:45:01.873,savaithi
login,0,2016-11-07 15:45:00.979,savaithi
getUserPreferences,2,2016-11-07 15:45:00.958,savaithi
getUserPreferences,2,2016-11-07 15:45:00.956,savaithi
SecurityServiceImpl.constructFromSession,2,2016-11-07 15:45:00.954,savaithi
setBooleanPreference,2,2016-11-07 15:45:00.954,savaithi
login,0,2016-11-07 15:45:00.658,savaithi
getPreference,1,2016-11-07 15:45:00.582,savaithi
getUserPreferences,129,2016-11-07 15:44:52.376,savaithi
login,2,2016-11-07 15:44:52.246,savaithi
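How can I dynamically access the rows between one login and the previous login (login[index-1])?
For the example below, I want to compute getPreference,1,2016-11-07 15:45:00.582
- login,2,2016-11-07 15:44:52.246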
login,0,2016-11-07 15:45:00.658,savaithi
getPreference,1,2016-11-07 15:45:00.582,savaithi
getUserPreferences,129,2016-11-07 15:44:52.376,savaithi
login,2,2016-11-07 15:44:52.246,savaithi
Answer 0 (score: 2)
IIUC, you can do it this way:
First, let's sort the DF:
In [71]: x = df.sort_values(['User','_time']).reset_index()
In [72]: x
Out[72]:
    index                                    Action  Duration                    _time      User
0      17                                     login         2  2016-11-07 15:44:52.246  savaithi
1      16                        getUserPreferences       129  2016-11-07 15:44:52.376  savaithi
2      15                             getPreference         1  2016-11-07 15:45:00.582  savaithi
3      14                                     login         0  2016-11-07 15:45:00.658  savaithi
4      12  SecurityServiceImpl.constructFromSession         2  2016-11-07 15:45:00.954  savaithi
5      13                      setBooleanPreference         2  2016-11-07 15:45:00.954  savaithi
6      11                        getUserPreferences         2  2016-11-07 15:45:00.956  savaithi
7      10                        getUserPreferences         2  2016-11-07 15:45:00.958  savaithi
8       9                                     login         0  2016-11-07 15:45:00.979  savaithi
9       8                    getQuickSearchInitInfo         3  2016-11-07 15:45:01.873  savaithi
10      7                   getQuickSearchInitInfo2         3  2016-11-07 15:45:01.995  savaithi
11      6                          getNoteTemplates         2  2016-11-07 15:45:04.220  savaithi
12      5                                     login      4810  2016-11-07 15:45:09.040  savaithi
13      4                             getEnclosures       447  2016-11-07 15:45:09.994  savaithi
14      3                             getAuditTrail       550  2016-11-07 15:45:10.627  savaithi
15      2                                     login      8088  2016-11-07 15:45:18.804  savaithi
16      1                        getServiceRequests         5  2016-11-07 15:45:18.902  savaithi
17      0                            getForeignBugs         3  2016-11-07 15:45:18.992  savaithi
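The sort above, and the later `diff()`, only behave as shown if `_time` holds real timestamps rather than strings. As a minimal sketch, assuming the sample arrives as CSV text (the `csv_text` variable and the four-row excerpt are illustrative, not part of the original post), the frame could be built like this:

```python
import io

import pandas as pd

# Illustrative excerpt of the question's sample data (newest rows first).
csv_text = """Action,Duration,_time,User
login,0,2016-11-07 15:45:00.658,savaithi
getPreference,1,2016-11-07 15:45:00.582,savaithi
getUserPreferences,129,2016-11-07 15:44:52.376,savaithi
login,2,2016-11-07 15:44:52.246,savaithi
"""

# parse_dates turns _time into datetime64 so sorting and diff() work on timestamps.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["_time"])

# Same step as in the answer: order by user, then time; keep the old index as a column.
x = df.sort_values(["User", "_time"]).reset_index()
print(x)
```

If the column was already loaded as strings, `df["_time"] = pd.to_datetime(df["_time"])` should achieve the same conversion.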
Now let's keep only the rows where Action == 'login', or where the next row's Action == 'login', plus the last row:
In [34]: x.loc[(x.Action == 'login') | (x.Action.shift(-1) == 'login') | (x.index == x.index[-1])]
Out[34]:
    index              Action  Duration                    _time      User
0      17               login         2  2016-11-07 15:44:52.246  savaithi
2      15       getPreference         1  2016-11-07 15:45:00.582  savaithi
3      14               login         0  2016-11-07 15:45:00.658  savaithi
7      10  getUserPreferences         2  2016-11-07 15:45:00.958  savaithi
8       9               login         0  2016-11-07 15:45:00.979  savaithi
11      6    getNoteTemplates         2  2016-11-07 15:45:04.220  savaithi
12      5               login      4810  2016-11-07 15:45:09.040  savaithi
14      3       getAuditTrail       550  2016-11-07 15:45:10.627  savaithi
15      2               login      8088  2016-11-07 15:45:18.804  savaithi
17      0      getForeignBugs         3  2016-11-07 15:45:18.992  savaithi
In [35]: x.loc[(x.Action == 'login') | (x.Action.shift(-1) == 'login') | (x.index == x.index[-1]), '_time'].diff()
Out[35]:
0                NaT
2    00:00:08.336000
3    00:00:00.076000
7    00:00:00.300000
8    00:00:00.021000
11   00:00:03.241000
12   00:00:04.820000
14   00:00:01.587000
15   00:00:08.177000
17   00:00:00.188000
Name: _time, dtype: timedelta64[ns]
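Note that in the `diff()` output above only the values at rows 2, 7, 11, 14 and 17 are session durations (last action minus login). The values at rows 3, 8, 12 and 15 are the gaps between the end of one session and the next login. The mask also ignores `User`, so with several users a `shift(-1)` can pair rows from different people. As an alternative sketch, not the answerer's method, a running count of logins per user can serve as a session id, and each session's duration is then its last timestamp minus its first. The `session` column and the `sessions` frame are names introduced here for illustration:

```python
import io

import pandas as pd

# The question's sample data, copied as CSV text.
csv_text = """Action,Duration,_time,User
getForeignBugs,3,2016-11-07 15:45:18.992,savaithi
getServiceRequests,5,2016-11-07 15:45:18.902,savaithi
login,8088,2016-11-07 15:45:18.804,savaithi
getAuditTrail,550,2016-11-07 15:45:10.627,savaithi
getEnclosures,447,2016-11-07 15:45:09.994,savaithi
login,4810,2016-11-07 15:45:09.040,savaithi
getNoteTemplates,2,2016-11-07 15:45:04.220,savaithi
getQuickSearchInitInfo2,3,2016-11-07 15:45:01.995,savaithi
getQuickSearchInitInfo,3,2016-11-07 15:45:01.873,savaithi
login,0,2016-11-07 15:45:00.979,savaithi
getUserPreferences,2,2016-11-07 15:45:00.958,savaithi
getUserPreferences,2,2016-11-07 15:45:00.956,savaithi
SecurityServiceImpl.constructFromSession,2,2016-11-07 15:45:00.954,savaithi
setBooleanPreference,2,2016-11-07 15:45:00.954,savaithi
login,0,2016-11-07 15:45:00.658,savaithi
getPreference,1,2016-11-07 15:45:00.582,savaithi
getUserPreferences,129,2016-11-07 15:44:52.376,savaithi
login,2,2016-11-07 15:44:52.246,savaithi
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["_time"])
x = df.sort_values(["User", "_time"]).reset_index(drop=True)

# Every login opens a new session, so the cumulative number of logins
# seen so far for a user identifies the session each row belongs to.
# Rows that precede a user's first login would land in session 0.
is_login = (x["Action"] == "login").astype(int)
x["session"] = is_login.groupby(x["User"]).cumsum()

# First and last timestamp of each (User, session) pair.
sessions = (
    x.groupby(["User", "session"])["_time"]
    .agg(start="min", end="max")
    .reset_index()
)
sessions["duration"] = sessions["end"] - sessions["start"]
print(sessions)
```

On the sample data this yields five sessions whose durations agree with rows 2, 7, 11, 14 and 17 of the `diff()` output. Grouping by `User` keeps sessions of different users apart, which the shift-based mask does not do by itself.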