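I want to compute the difference between each row and each of the next 5 rows, take the maximum of those differences (ignoring NaN), repeat this for every row of a pandas DataFrame, and write the result to a new column. I have tried the .shift(1) function together with iterating over all rows, but that seems slow.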
A' B' Output
AA 1 4
BB 2 3
CC 3 2
DD 4 1
EE 5 0
Answer 0 (score: 0)
Have you tried diff?
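As a small illustration of the idea (the five-element series here is made up for the example): `Series.diff(k)` subtracts the value k rows earlier, so shifting the result up by k rows turns it into a forward difference between each row and the row k positions after it.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])  # illustrative data only

# diff(2) looks backwards: s[i] - s[i-2], so the first two rows are NaN.
backward = s.diff(2)

# Shifting up by 2 realigns it as a forward difference: s[i+2] - s[i],
# so the last two rows are NaN instead.
forward = s.diff(2).shift(-2)

print(pd.DataFrame({'s': s, 'backward': backward, 'forward': forward}))
```

The answer's code applies this for every lag from 1 to 5 and then takes the row-wise maximum.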
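import pandas as pd

df = pd.DataFrame({'a': [1,2,3,4,5,0,5,1,4,3,2]})
col_n = []   # names of the helper columns, one per lag
diff_r = 5   # number of following rows to compare against

for i in range(1, diff_r+1):
    col_n.append('d_'+str(i))
    # diff(i) gives a[row] - a[row-i]; shifting by -i realigns it
    # so that d_i holds a[row+i] - a[row]
    df['d_'+str(i)] = df['a'].diff(i).shift(periods=-i)

# max() skips NaN by default, so incomplete windows at the end still work
df['d_abs_max'] = df[col_n].abs().max(axis=1)
df['d_max'] = df[col_n].max(axis=1)
print(df)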
    a   d_1   d_2   d_3   d_4   d_5  d_abs_max  d_max
0   1   1.0   2.0   3.0   4.0  -1.0        4.0    4.0
1   2   1.0   2.0   3.0  -2.0   3.0        3.0    3.0
2   3   1.0   2.0  -3.0   2.0  -2.0        3.0    2.0
3   4   1.0  -4.0   1.0  -3.0   0.0        4.0    1.0
4   5  -5.0   0.0  -4.0  -1.0  -2.0        5.0    0.0
5   0   5.0   1.0   4.0   3.0   2.0        5.0    5.0
6   5  -4.0  -1.0  -2.0  -3.0   NaN        4.0   -1.0
7   1   3.0   2.0   1.0   NaN   NaN        3.0    3.0
8   4  -1.0  -2.0   NaN   NaN   NaN        2.0   -1.0
9   3  -1.0   NaN   NaN   NaN   NaN        1.0   -1.0
10  2   NaN   NaN   NaN   NaN   NaN        NaN    NaN
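If the per-lag helper columns are not needed, one possible variation (not part of the answer above, just a sketch under the same sample data) is to use a reversed rolling window. It relies on the fact that the largest forward difference equals the maximum of the following rows minus the current value. The names `window`, `rev`, `next_max` and `next_min` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 0, 5, 1, 4, 3, 2]})
window = 5  # number of following rows to look at (assumed, as in the answer)

# Rolling windows look backwards, so roll over the reversed series;
# after reversing back, position i covers rows i .. i+window-1.
rev = df['a'][::-1].rolling(window, min_periods=1)

# shift(-1) moves the window one row ahead: rows i+1 .. i+window.
next_max = rev.max()[::-1].shift(-1)
next_min = rev.min()[::-1].shift(-1)

# Largest signed difference to any of the following rows.
df['d_max'] = next_max - df['a']

# Largest absolute difference: it is reached at the max or the min.
df['d_abs_max'] = pd.concat(
    [next_max - df['a'], df['a'] - next_min], axis=1
).max(axis=1)

print(df)
```

On this sample the two columns should match `d_max` and `d_abs_max` from the loop version, with NaN in the last row because no rows follow it. If the last row should be 0 instead, as in the question's table, appending `.fillna(0)` to the column would be one way to get that.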