Identify consecutive cells by a condition value

Time: 2019-04-08 12:56:08

Tags: python pandas

I would like to know how to add an extra column to the dataframe below that is 1 for rows belonging to a run of 3 or more consecutive values in the age column that are greater than 35, and 0 otherwise.

Data

   age
0   12
1   50
2   49
3   29
4   55
5   34
6   23
7   46
8   87
9   39

Desired output:

   age  flag
0   12     0
1   50     0
2   49     0
3   29     0
4   55     0
5   34     0
6   23     0
7   46     1
8   87     1
9   39     1

How could I do that? Thanks.

1 answer:

Answer 0 (score: 3)

First compare the values with Series.gt (for >). Then build an identifier for each run of consecutive equal values using shift with cumsum, group by that identifier, and get the length of each run with GroupBy.transform. Compare the lengths with Series.ge and combine the result with the original mask s using &, so that runs of 3 or more values that are not greater than 35 are not flagged. Finally, convert to integers to map True/False to 1/0:

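s = df['age'].gt(35)            # True where age > 35
g = s.ne(s.shift()).cumsum()    # run id: increments whenever s changes

# run length >= 3, restricted to rows where the condition itself holds
df['flag'] = (s.groupby(g).transform('size').ge(3) & s).astype(int)
print(df)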
   age  flag
0   12     0
1   50     0
2   49     0
3   29     0
4   55     0
5   34     0
6   23     0
7   46     1
8   87     1
9   39     1
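
To see what each step contributes, here is a self-contained sketch that rebuilds the question's data and keeps the intermediate results as columns. The helper name flag_runs, its parameters threshold and min_run, and the columns s, g and run_len in the steps frame are introduced here for illustration only; they are not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"age": [12, 50, 49, 29, 55, 34, 23, 46, 87, 39]})


def flag_runs(values: pd.Series, threshold: float, min_run: int) -> pd.Series:
    """Hypothetical helper: 1 where a value belongs to a run of at least
    `min_run` consecutive values greater than `threshold`, else 0."""
    s = values.gt(threshold)                  # True where the condition holds
    g = s.ne(s.shift()).cumsum()              # new run id each time s changes
    run_len = s.groupby(g).transform("size")  # length of the run each row is in
    # `& s` drops long runs where the condition is False.
    return (run_len.ge(min_run) & s).astype(int)


# Same steps, kept as columns so they can be inspected side by side.
steps = pd.DataFrame({"age": df["age"], "s": df["age"].gt(35)})
steps["g"] = steps["s"].ne(steps["s"].shift()).cumsum()
steps["run_len"] = steps["s"].groupby(steps["g"]).transform("size")
steps["flag"] = flag_runs(df["age"], threshold=35, min_run=3)
print(steps)
```

In the printed frame, rows 5 and 6 (ages 34 and 23) share a run id with run_len 2 even though s is False there. A run of three or more such values would pass the ge(3) test on its own, which is why the mask s is combined back in with &.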