Conditional incremental counter in pandas

Date: 2019-07-29 13:31:03

Tags: python pandas

For the following pandas DataFrame:

    servo_in_position  second_servo_in_position  Expected output
0                   0                         1                0
1                   0                         1                0
2                   1                         2                1
3                   0                         3                0
4                   1                         4                2
5                   1                         4                2
6                   0                         5                0
7                   0                         5                0
8                   1                         6                3
9                   0                         7                0
10                  1                         8                4
11                  0                         9                0
12                  1                        10                5
13                  1                        10                5
14                  1                        10                5
15                  0                        11                0
16                  0                        11                0
17                  0                        11                0
18                  1                        12                6
19                  1                        12                6
20                  0                        13                0
21                  0                        13                0
22                  0                        13                0

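I want the "Expected output" column to increment only when "servo_in_position" changes from 0 to 1. I also want "Expected output" to be 0 (empty) wherever "servo_in_position" equals 0.

I tried: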

input_data['second_servo_in_position']=(input_data.servo_in_position.diff()!=0).cumsum()

But that gives the output shown in the "second_servo_in_position" column, which is not what I want.
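
For reference, a minimal sketch of why this attempt numbers every run of equal values rather than only the 0 to 1 transitions. It uses a shortened, made-up series (the name `s` is illustrative, not from the question): `diff()` is NaN on the first row and -1 on each 1 to 0 change, and both compare as "not equal to 0".

```python
import pandas as pd

# Shortened, made-up sample; `s` is an illustrative name.
s = pd.Series([0, 0, 1, 0, 1, 1, 0])

# True on the first row (NaN != 0) and on every change, in either direction.
changed = s.diff() != 0

# Each True starts a new number, so every run gets its own label.
attempt = changed.cumsum()

print(attempt.tolist())  # [1, 1, 2, 3, 4, 4, 5]
```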

After that, I want to group by this column and compute the mean, like this:

print("Mean=\n\n",input_data.groupby('second_servo_in_position').mean())

5 answers:

Answer 0: (score: 10)

Use cumsum and mask.

df['E_output'] = df['servo_in_position'].diff().eq(1).cumsum()\
   .mask(df['servo_in_position'] == 0, 0)


Output:

    servo_in_position  second_servo_in_position  Expected output  E_output
0                   0                         1                0         0
1                   0                         1                0         0
2                   1                         2                1         1
3                   0                         3                0         0
4                   1                         4                2         2
5                   1                         4                2         2
6                   0                         5                0         0
7                   0                         5                0         0
8                   1                         6                3         3
9                   0                         7                0         0
10                  1                         8                4         4
11                  0                         9                0         0
12                  1                        10                5         5
13                  1                        10                5         5
14                  1                        10                5         5
15                  0                        11                0         0
16                  0                        11                0         0
17                  0                        11                0         0
18                  1                        12                6         6
19                  1                        12                6         6
20                  0                        13                0         0
21                  0                        13                0         0
22                  0                        13                0         0

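Update: if the first value is already 1, fill the NaN that diff() leaves on the first row so that the first run is counted too: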

df['servo_in_position'].diff().fillna(df['servo_in_position']).eq(1).cumsum()\
   .mask(df['servo_in_position'] == 0, 0)
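
To illustrate what the update changes, here is a small sketch on a made-up series that starts at 1 (the names `s`, `plain`, and `patched` are illustrative). Without `fillna`, the first run is never counted because `diff()` yields NaN on the first row.

```python
import pandas as pd

# Made-up series that starts in position (first value is 1).
s = pd.Series([1, 1, 0, 1, 0])

# Original expression: the leading run of 1s gets label 0.
plain = s.diff().eq(1).cumsum().mask(s == 0, 0)

# Updated expression: the first-row NaN is replaced by the value itself,
# so a leading 1 counts as a 0 -> 1 transition.
patched = s.diff().fillna(s).eq(1).cumsum().mask(s == 0, 0)

print(plain.tolist())    # [0, 0, 0, 1, 0]
print(patched.tolist())  # [1, 1, 0, 2, 0]
```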

Answer 1: (score: 10)

Use cumsum and arithmetic.


u = df['servo_in_position']

(u.eq(1) & u.shift().ne(1)).cumsum() * u

0     0
1     0
2     1
3     0
4     2
5     2
6     0
7     0
8     3
9     0
10    4
11    0
12    5
13    5
14    5
15    0
16    0
17    0
18    6
19    6
20    0
21    0
22    0
Name: servo_in_position, dtype: int64
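
The question goes on to group by the label and take the mean. A sketch of that follow-up step, under the assumption that rows labelled 0 should be left out of the averages; the `reading` column and the `block` name are hypothetical and only serve the illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "servo_in_position": [0, 1, 1, 0, 1, 0],
    "reading": [5.0, 1.0, 3.0, 9.0, 4.0, 7.0],  # hypothetical measurement column
})

u = df["servo_in_position"]

# Same expression as above: count each 0 -> 1 start, then zero out the 0 rows.
df["block"] = (u.eq(1) & u.shift().ne(1)).cumsum() * u

# Label 0 marks "not in position" rows; drop them before averaging.
means = df[df["block"] > 0].groupby("block")["reading"].mean()

print(means)
```

Here block 1 averages the readings 1.0 and 3.0, and block 2 contains the single reading 4.0.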

Answer 2: (score: 7)

Try np.where:

df['Expected_output'] = np.where(df.servo_in_position.eq(1),
                                 df.servo_in_position.diff().eq(1).cumsum(),
                                 0)
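
A self-contained version of the same idea, with the imports it needs and a shortened, made-up sample. Note that `np.where` returns a plain NumPy array, which pandas then assigns positionally to the new column.

```python
import numpy as np
import pandas as pd

# Shortened, made-up sample of the question's column.
df = pd.DataFrame({"servo_in_position": [0, 0, 1, 0, 1, 1, 0]})
s = df["servo_in_position"]

# Where the servo is in position, use the running count of 0 -> 1 changes;
# everywhere else, use 0.
df["Expected_output"] = np.where(s.eq(1), s.diff().eq(1).cumsum(), 0)

print(df["Expected_output"].tolist())  # [0, 0, 1, 0, 2, 2, 0]
```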

Answer 3: (score: 6)

Use cumsum and mul.
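
The code for this answer is not preserved here. One way to combine `cumsum` and `mul`, which may differ from what the author actually wrote, is sketched below on a made-up sample; the names `starts` and `labels` are illustrative.

```python
import pandas as pd

# Shortened, made-up sample of the question's column.
df = pd.DataFrame({"servo_in_position": [0, 0, 1, 0, 1, 1, 0]})
s = df["servo_in_position"]

# True exactly where the value goes from 0 to 1.
starts = s.diff().eq(1)

# Running count of starts, multiplied by the 0/1 column so that
# rows where the servo is not in position become 0.
labels = starts.cumsum().mul(s)

print(labels.tolist())  # [0, 0, 1, 0, 2, 2, 0]
```

As with the first answer, a series that begins with 1 would need the first-row NaN from `diff()` handled separately.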

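Answer 4: (score: 4)

A fast loop with Numba

import numpy as np
from numba import njit

@njit
def f(u):
    out = np.zeros(len(u), np.int64)
    # Seed the counter with the first value so a leading 1 is counted.
    a = out[0] = u[0]
    for i in range(1, len(u)):
        if u[i] == 1:
            # A 0 -> 1 transition starts a new block.
            if u[i - 1] == 0:
                a += 1
            out[i] = a
    return out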

f(df.servo_in_position.to_numpy())

array([0, 0, 1, 0, 2, 2, 0, 0, 3, 0, 4, 0, 5, 5, 5, 0, 0, 0, 6, 6, 0, 0, 0])
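
As a sanity check, the same loop can be run without the `@njit` decorator (so Numba does not need to be installed) and compared against the vectorized expression from Answer 1. This is only a sketch on a made-up series; `label_runs` is an illustrative name, and like the original it assumes a non-empty input.

```python
import numpy as np
import pandas as pd

def label_runs(u):
    """Plain-Python copy of the loop above, without @njit."""
    out = np.zeros(len(u), np.int64)
    a = out[0] = u[0]  # a leading 1 counts as the first block
    for i in range(1, len(u)):
        if u[i] == 1:
            if u[i - 1] == 0:  # 0 -> 1 transition starts a new block
                a += 1
            out[i] = a
    return out

# Made-up series that starts at 1 to exercise the first-element case.
s = pd.Series([1, 0, 1, 1, 0, 0, 1])

loop_result = label_runs(s.to_numpy())
vectorized = ((s.eq(1) & s.shift().ne(1)).cumsum() * s).to_numpy()

print(loop_result.tolist())  # [1, 0, 2, 2, 0, 0, 3]
print(np.array_equal(loop_result, vectorized))  # True
```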