我的数据框看起来像这样:
raw_data ={'col0':[1,4,5,1,3,3,1,5,8,9,1,2]}
df = DataFrame(raw_data)
col0
0 1
1 4
2 5
3 1
4 3
5 3
6 1
7 5
8 8
9 9
10 1
11 2
我想要做的是计算每3行以适应条件(df ['col0']> 3)并使新col看起来像这样:
col0 col_roll_count3
0 1 0
1 4 1
2 5 2 #[index 0,1,2/ 4,5 fit the condition]
3 1 2
4 3 1
5 3 0 #[index 3,4,5/no fit the condition]
6 1 0
7 5 1
8 8 2
9 9 3
10 1 2
11 2 1
我怎样才能做到这一点?
我尝试了但失败了:
df['col_roll_count3'] = df[df['col0']>3].rolling(3).count()
print(df)
col0 col1
0 1 NaN
1 4 1.0
2 5 2.0
3 1 NaN
4 3 NaN
5 3 NaN
6 1 NaN
7 5 3.0
8 8 3.0
9 9 3.0
10 1 NaN
11 2 NaN
答案 0 :(得分:1)
让我们使用rolling
,apply
,np.count_nonzero
:
df['col_roll_count3'] = df.col0.rolling(3,min_periods=1)\
.apply(lambda x: np.count_nonzero(x>3))
输出:
col0 col_roll_count3
0 1 0.0
1 4 1.0
2 5 2.0
3 1 2.0
4 3 1.0
5 3 0.0
6 1 0.0
7 5 1.0
8 8 2.0
9 9 3.0
10 1 2.0
11 2 1.0
答案 1 :(得分:1)
df['col_roll_count3'] = df['col0'].gt(3).rolling(3).sum()