Using numpy select efficiently

Time: 2019-05-31 00:21:30

Tags: python pandas select

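I have some data as shown below. I am trying to compute the values in the Time bw column (the fourth row should be 0). Whenever Location changes to a new location, for example from a to b, I want Time bw to restart from 0. I am trying to use ne, select and diff().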

+----------+---------------------+----------+
| Location |         Date        | Time bw  |
+----------+---------------------+----------+
| a        | 2018-06-26 00:00:00 |        0 |
| a        | 2018-06-26 00:00:00 |        0 |
| a        | 2018-06-26 00:00:00 |        0 |
| b        | 2018-08-03 00:00:00 |       38 |
| b        | 2018-08-03 00:00:00 |        0 |
| b        | 2018-08-04 00:00:00 |        1 |
| b        | 2018-08-04 00:00:00 |        0 |
| b        | 2018-08-04 00:00:00 |        0 |
| b        | 2018-08-04 00:00:00 |        0 |
| b        | 2018-08-04 00:00:00 |        0 |
| b        | 2018-08-04 00:00:00 |        0 |
| b        | 2018-08-05 00:00:00 |        1 |
| b        | 2018-08-08 00:00:00 |        3 |
| b        | 2018-08-08 00:00:00 |        0 |
| b        | 2018-08-08 00:00:00 |        0 |
| b        | 2018-08-08 00:00:00 |        0 |
| b        | 2018-08-08 00:00:00 |        0 |
| c        | 2018-08-14 00:00:00 |        6 |
| c        | 2018-08-14 00:00:00 |        0 |
| c        | 2018-08-14 00:00:00 |        0 |
+----------+---------------------+----------+

1 answer:

Answer 0 (score: 1)

IIUC:

import numpy as np
df['Time bw'] = np.where(df.Location.ne(df.Location.shift()), 0, df['Time bw'])  # 0 on the first row of each new Location, otherwise keep the existing value

Output:

    Location    Date    Time bw
0   a   20180626 00:00:00   0
1   a   20180626 00:00:00   0
2   a   20180626 00:00:00   0
3   b   20180803 00:00:00   0
4   b   20180803 00:00:00   0
5   b   20180804 00:00:00   1
6   b   20180804 00:00:00   0
7   b   20180804 00:00:00   0
8   b   20180804 00:00:00   0
9   b   20180804 00:00:00   0
10  b   20180804 00:00:00   0
11  b   20180805 00:00:00   1
12  b   20180808 00:00:00   3
13  b   20180808 00:00:00   0
14  b   20180808 00:00:00   0
15  b   20180808 00:00:00   0
16  b   20180808 00:00:00   0
17  c   20180814 00:00:00   0
18  c   20180814 00:00:00   0
19  c   20180814 00:00:00   0
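
The one-liner above assumes Time bw already holds the day differences. If the column has to be built from Date in the first place, one possible approach is to combine diff() with the same ne/shift mask. This is only a sketch under assumptions: the sample frame is a hypothetical subset of the question's rows, Date is assumed to be a datetime column, and the names is_new_location and days_since_prev are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample mirroring a few rows of the question's table.
df = pd.DataFrame({
    "Location": ["a", "a", "a", "b", "b", "b", "b", "c", "c"],
    "Date": pd.to_datetime([
        "2018-06-26", "2018-06-26", "2018-06-26",
        "2018-08-03", "2018-08-03", "2018-08-04", "2018-08-08",
        "2018-08-14", "2018-08-14",
    ]),
})

# True on the first row of every run of identical Location values
# (the very first row compares against NaN, so it is also True).
is_new_location = df["Location"].ne(df["Location"].shift())

# Whole-day difference from the previous row; NaN for the first row.
days_since_prev = df["Date"].diff().dt.days

# Reset to 0 wherever the location changes, otherwise keep the day difference.
df["Time bw"] = np.where(is_new_location, 0, days_since_prev).astype(int)

print(df)
```

Because the mask compares each row with the one directly before it, the reset also fires if a location shows up again later in the frame, which a plain groupby on Location would not do.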