Python在一个pandas列中查找日期之间的差异?

时间:2016-11-07 20:13:23

标签: python datetime pandas session-cookies

我正在尝试提出一种计算会话持续时间的方法。我的样本数据如下。我假设如果有人再次登录 - 他们开始一个新的会话,因此前一个会话应该已经结束。因此,在用户再次登录为会话持续时间之前,我将使用登录操作。

Action,Duration,_time,User
getForeignBugs,3,2016-11-07 15:45:18.992,savaithi
getServiceRequests,5,2016-11-07 15:45:18.902,savaithi
login,8088,2016-11-07 15:45:18.804,savaithi
getAuditTrail,550,2016-11-07 15:45:10.627,savaithi
getEnclosures,447,2016-11-07 15:45:09.994,savaithi
login,4810,2016-11-07 15:45:09.040,savaithi
getNoteTemplates,2,2016-11-07 15:45:04.220,savaithi
getQuickSearchInitInfo2,3,2016-11-07 15:45:01.995,savaithi
getQuickSearchInitInfo,3,2016-11-07 15:45:01.873,savaithi
login,0,2016-11-07 15:45:00.979,savaithi
getUserPreferences,2,2016-11-07 15:45:00.958,savaithi
getUserPreferences,2,2016-11-07 15:45:00.956,savaithi
SecurityServiceImpl.constructFromSession,2,2016-11-07 15:45:00.954,savaithi
setBooleanPreference,2,2016-11-07 15:45:00.954,savaithi
login,0,2016-11-07 15:45:00.658,savaithi
getPreference,1,2016-11-07 15:45:00.582,savaithi
getUserPreferences,129,2016-11-07 15:44:52.376,savaithi
login,2,2016-11-07 15:44:52.246,savaithi

如何在登录和登录[index-1]之间动态访问数据?

以下示例我想使用getPreference,1,2016-11-07 15:45:00.582 - login,2,2016-11-07 15:44:52.246

login,0,2016-11-07 15:45:00.658,savaithi
getPreference,1,2016-11-07 15:45:00.582,savaithi
getUserPreferences,129,2016-11-07 15:44:52.376,savaithi
login,2,2016-11-07 15:44:52.246,savaithi

1 个答案:

答案 0 :(得分:2)

IIUC你可以这样做:

首先让我们对DF进行排序:

In [71]: x = df.sort_values(['User','_time']).reset_index()

In [72]: x
Out[72]:
    index                                    Action  Duration                   _time      User
0      17                                     login         2 2016-11-07 15:44:52.246  savaithi
1      16                        getUserPreferences       129 2016-11-07 15:44:52.376  savaithi
2      15                             getPreference         1 2016-11-07 15:45:00.582  savaithi
3      14                                     login         0 2016-11-07 15:45:00.658  savaithi
4      12  SecurityServiceImpl.constructFromSession         2 2016-11-07 15:45:00.954  savaithi
5      13                      setBooleanPreference         2 2016-11-07 15:45:00.954  savaithi
6      11                        getUserPreferences         2 2016-11-07 15:45:00.956  savaithi
7      10                        getUserPreferences         2 2016-11-07 15:45:00.958  savaithi
8       9                                     login         0 2016-11-07 15:45:00.979  savaithi
9       8                    getQuickSearchInitInfo         3 2016-11-07 15:45:01.873  savaithi
10      7                   getQuickSearchInitInfo2         3 2016-11-07 15:45:01.995  savaithi
11      6                          getNoteTemplates         2 2016-11-07 15:45:04.220  savaithi
12      5                                     login      4810 2016-11-07 15:45:09.040  savaithi
13      4                             getEnclosures       447 2016-11-07 15:45:09.994  savaithi
14      3                             getAuditTrail       550 2016-11-07 15:45:10.627  savaithi
15      2                                     login      8088 2016-11-07 15:45:18.804  savaithi
16      1                        getServiceRequests         5 2016-11-07 15:45:18.902  savaithi
17      0                            getForeignBugs         3 2016-11-07 15:45:18.992  savaithi

现在让我们只报告那些Action == 'login'next.Action == 'login'加上最后一行

的行
In [34]: x.loc[(x.Action == 'login') | (x.Action.shift(-1) == 'login') | (x.index == x.index[-1])]
Out[34]:
    index              Action  Duration                   _time      User
0      17               login         2 2016-11-07 15:44:52.246  savaithi
2      15       getPreference         1 2016-11-07 15:45:00.582  savaithi
3      14               login         0 2016-11-07 15:45:00.658  savaithi
7      10  getUserPreferences         2 2016-11-07 15:45:00.958  savaithi
8       9               login         0 2016-11-07 15:45:00.979  savaithi
11      6    getNoteTemplates         2 2016-11-07 15:45:04.220  savaithi
12      5               login      4810 2016-11-07 15:45:09.040  savaithi
14      3       getAuditTrail       550 2016-11-07 15:45:10.627  savaithi
15      2               login      8088 2016-11-07 15:45:18.804  savaithi
17      0      getForeignBugs         3 2016-11-07 15:45:18.992  savaithi

In [35]: x.loc[(x.Action == 'login') | (x.Action.shift(-1) == 'login') | (x.index == x.index[-1]), '_time'].diff()
Out[35]:
0                NaT
2    00:00:08.336000
3    00:00:00.076000
7    00:00:00.300000
8    00:00:00.021000
11   00:00:03.241000
12   00:00:04.820000
14   00:00:01.587000
15   00:00:08.177000
17   00:00:00.188000
Name: _time, dtype: timedelta64[ns]