Question

I have a pandas DataFrame like the one shown below:

    flag  a  b  c
0      1  5  1  3
1      1  2  1  3
2      1  3  0  3
3      1  4  0  3
4      1  5  5  3
5      1  6  0  3
6      1  7  0  3
7      2  6  1  4
8      2  2  1  4
9      2  3  1  4
10     2  4  1  4
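
For anyone who wants to reproduce the problem, the frame above can be rebuilt roughly as follows. This is only a convenience sketch; the values are copied from the table, and integer dtypes are an assumption.

```python
import pandas as pd

# Rebuild the example frame from the table above (values copied by hand).
df = pd.DataFrame({
    "flag": [1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2],
    "a":    [5, 2, 3, 4, 5, 6, 7, 6, 2, 3, 4],
    "b":    [1, 1, 0, 0, 5, 0, 0, 1, 1, 1, 1],
    "c":    [3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4],
})
print(df)
```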

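I want to create a column "d" based on the following conditions:

1) For the first row of each flag, if a > c then d = b, otherwise d = NaN.

2) For every non-first row of each flag, if (a > c) & ((the previous row's d is NaN) | (b > the previous row's d)) then d = b, otherwise d = the previous row's d.

I expect the following output: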

    flag  a  b  c  d
0      1  5  1  3  1
1      1  2  1  3  1
2      1  3  0  3  1
3      1  4  0  3  1
4      1  5  5  3  5
5      1  6  0  3  5
6      1  7  0  3  5
7      2  6  1  4  1
8      2  2  1  4  1
9      2  3  1  4  1
10     2  4  1  4  1
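
As a reference point, the two rules can be transcribed row by row. The sketch below is not vectorized and is only meant to pin down the intended behaviour; `compute_d_loop` is a hypothetical helper name, and it assumes rows of the same flag are contiguous, as they are in the sample data.

```python
import numpy as np
import pandas as pd


def compute_d_loop(df: pd.DataFrame) -> pd.Series:
    """Row-by-row transcription of rules 1) and 2) (hypothetical helper)."""
    # True on the first row of each run of equal flag values.
    is_first = df["flag"].ne(df["flag"].shift())
    d = []
    prev_d = np.nan
    for first, a, b, c in zip(is_first, df["a"], df["b"], df["c"]):
        if first:
            # Rule 1: a new flag starts with no previous d.
            prev_d = np.nan
        # Rule 2; rule 1 is the special case where prev_d is NaN.
        if a > c and (np.isnan(prev_d) or b > prev_d):
            prev_d = float(b)
        d.append(prev_d)
    return pd.Series(d, index=df.index, name="d")
```

Calling `df['d'] = compute_d_loop(df)` on the sample frame should give the column shown in the expected output, as floats.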

Answer 1

Here is how I would translate your logic:

import numpy as np

df['d'] = np.nan

# first row of each flag
s = df.flag.ne(df.flag.shift())

# where a > c
a_gt_c = df['a'].gt(df['c'])

# fill the first rows where a > c
df.loc[s & a_gt_c, 'd'] = df['b']

# mask for second fill
mask = ((~s)                                # not first rows
        & a_gt_c                            # a > c
        & (df['d'].shift().isna()           # previous d is null
           | df['b'].gt(df['d'].shift()))   # or b > previous d
       )

# fill those values:
df.loc[mask, 'd'] = df['b']

# ffill for the rest
df['d'] = df['d'].ffill()

Output (rows 3, 5 and 6 come out as 0.0 rather than the expected 1, 5 and 5):

    flag  a  b  c    d
0      1  5  1  3  1.0
1      1  2  1  3  1.0
2      1  3  0  3  1.0
3      1  4  0  3  0.0
4      1  5  5  3  5.0
5      1  6  0  3  0.0
6      1  7  0  3  0.0
7      2  6  1  4  1.0
8      2  2  1  4  1.0
9      2  3  1  4  1.0
10     2  4  1  4  1.0
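
Because d depends on its own previous value, a single mask only sees the d values that were filled in for the first rows, not the updates made further down the same group. One possible way around this, offered as a sketch rather than as part of the approach above: if I read the rules correctly, within each flag d is simply the running maximum of b over the rows where a > c, and NaN until the first such row. The helper name `compute_d_cummax` is hypothetical, and the -inf placeholder assumes b never legitimately contains -inf.

```python
import numpy as np
import pandas as pd


def compute_d_cummax(df: pd.DataFrame) -> pd.Series:
    """Running max of b over rows with a > c, per flag (hypothetical helper)."""
    # Keep b only where a > c; other rows get -inf so they never win the max.
    candidates = df["b"].astype(float).where(df["a"] > df["c"], -np.inf)
    # Running maximum within each flag group.
    running = candidates.groupby(df["flag"]).cummax()
    # Rows before the first a > c row of a group are still -inf: report NaN.
    return running.replace(-np.inf, np.nan).rename("d")
```

With the sample frame, `df['d'] = compute_d_cummax(df)` should reproduce the expected column from the question. Since it groups by flag, it does not rely on the rows of each flag being contiguous.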
