How to compute differences across multiple rows in pandas?

Time: 2019-11-05 08:29:21

Tags: python pandas

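I want to compute the difference between each row and each of the next 5 rows, take the maximum of all these values (only where they are not NaN), repeat this for every row of a pandas DataFrame, and finally write the results to a new column. I have tried the .shift(1) function and iterating over all the rows, but that seems very slow.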

A'  B'  Output
AA  1   4
BB  2   3
CC  3   2
DD  4   1
EE  5   0

1 answer:

Answer 0 (score: 0)

Have you tried diff?

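import pandas as pd

df = pd.DataFrame({'a': [1,2,3,4,5,0,5,1,4,3,2]})
col_n = []   # names of the helper difference columns
diff_r = 5   # how many following rows to compare against
for i in range(1, diff_r+1):
    col_n.append('d_'+str(i))
    # diff(i) gives a[j] - a[j-i]; shifting by -i realigns it to a[j+i] - a[j]
    df['d_'+str(i)] = df['a'].diff(i).shift(periods=-i)
# max(axis=1) skips NaN by default, so rows near the end use only the differences that exist
df['d_abs_max'] = df[col_n].abs().max(axis=1)
df['d_max'] = df[col_n].max(axis=1)
print(df)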

    a  d_1  d_2  d_3  d_4  d_5  d_abs_max  d_max
0   1  1.0  2.0  3.0  4.0 -1.0        4.0    4.0
1   2  1.0  2.0  3.0 -2.0  3.0        3.0    3.0
2   3  1.0  2.0 -3.0  2.0 -2.0        3.0    2.0
3   4  1.0 -4.0  1.0 -3.0  0.0        4.0    1.0
4   5 -5.0  0.0 -4.0 -1.0 -2.0        5.0    0.0
5   0  5.0  1.0  4.0  3.0  2.0        5.0    5.0
6   5 -4.0 -1.0 -2.0 -3.0  NaN        4.0   -1.0
7   1  3.0  2.0  1.0  NaN  NaN        3.0    3.0
8   4 -1.0 -2.0  NaN  NaN  NaN        2.0   -1.0
9   3 -1.0  NaN  NaN  NaN  NaN        1.0   -1.0
10  2  NaN  NaN  NaN  NaN  NaN        NaN    NaN
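
If the helper columns d_1 ... d_5 are not needed, the same two results can probably be obtained without a Python loop. The largest of a[j+i] - a[j] over the next 5 rows equals the maximum of those next 5 values minus a[j], and the largest absolute difference is the larger of (forward max - a[j]) and (a[j] - forward min). The following is only a sketch under that assumption; forward_extreme and window are hypothetical names introduced here, not part of pandas or of the answer above.

```python
import pandas as pd

def forward_extreme(s, window, how):
    """Max or min of the next `window` values after each row (current row excluded)."""
    # Rolling over the reversed series looks "forward" in the original order.
    rev = s.iloc[::-1].rolling(window, min_periods=1)
    out = rev.max() if how == "max" else rev.min()
    # Restore the original order, then drop the current row from its own window.
    return out.iloc[::-1].shift(-1)

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 0, 5, 1, 4, 3, 2]})
window = 5  # number of following rows to compare against

fwd_max = forward_extreme(df['a'], window, "max")
fwd_min = forward_extreme(df['a'], window, "min")

# Largest signed difference a[j+i] - a[j] for i = 1..window
df['d_max'] = fwd_max - df['a']
# Largest absolute difference |a[j+i] - a[j]| for i = 1..window
df['d_abs_max'] = pd.concat([fwd_max - df['a'], df['a'] - fwd_min], axis=1).max(axis=1)
print(df)
```

On the sample data this should give the same d_max and d_abs_max columns as the diff-based loop, with NaN in the last row because no following rows exist there.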