How to avoid iterating over a DataFrame

Time: 2019-12-14 17:30:23

Tags: python pandas

I have the following DataFrame (sample):

                                  time       t2m  ...        av      kont
latitude longitude                                ...                    
46.5     18.0      1998-01-12 07:00:00  0.284698  ...  0.001613          
         18.0      1998-01-24 08:00:00 -1.304504  ...  0.001418  FROMHERE
         18.0      1998-01-24 09:00:00 -1.113770  ...  0.002679          
         18.0      1998-01-24 17:00:00  0.345001  ...  0.004633  FROMHERE
         18.0      1998-01-24 18:00:00 -0.122498  ...  0.004400          
         18.0      1998-01-24 19:00:00  0.041565  ...  0.002184          
         18.0      1998-01-24 20:00:00  0.100861  ...  0.002220          
         18.0      1998-01-24 21:00:00  0.120636  ...  0.003083          
         18.0      1998-01-24 22:00:00 -0.615662  ...  0.004330          
         18.0      1998-01-24 23:00:00 -0.686798  ...  0.002404          
         18.0      1998-01-25 00:00:00 -0.743134  ...  0.000953          
         18.0      1998-01-29 02:00:00 -4.786346  ...  0.002984  FROMHERE

I need to apply a function to each row and store the result in another column.

Example function:

def f1(t2m,av,d):
    return t2m*av+d

Note that each row takes as input the new value computed for the previous row. d0 is known, and d must be reset to d0 each time FROMHERE appears.

The desired output is:

                                  time       t2m  ...        av      kont   d
latitude longitude                                ...                    
46.5     18.0      1998-01-12 07:00:00  0.284698  ...  0.001613             d0
         18.0      1998-01-24 08:00:00 -1.304504  ...  0.001418  FROMHERE   d0
         18.0      1998-01-24 09:00:00 -1.113770  ...  0.002679             d[previous]+f1(t2m,av,d[previous])
         18.0      1998-01-24 17:00:00  0.345001  ...  0.004633  FROMHERE   d0
         18.0      1998-01-24 18:00:00 -0.122498  ...  0.004400             d[previous]+f1(t2m,av,d[previous])
         18.0      1998-01-24 19:00:00  0.041565  ...  0.002184             d[previous]+f1(t2m,av,d[previous])
         18.0      1998-01-24 20:00:00  0.100861  ...  0.002220             ...
         18.0      1998-01-24 21:00:00  0.120636  ...  0.003083          
         18.0      1998-01-24 22:00:00 -0.615662  ...  0.004330          
         18.0      1998-01-24 23:00:00 -0.686798  ...  0.002404          
         18.0      1998-01-25 00:00:00 -0.743134  ...  0.000953          
         18.0      1998-01-29 02:00:00 -4.786346  ...  0.002984  FROMHERE   d0
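
Read literally, the table above amounts to the following plain loop. This is only a reference sketch to pin the rule down, not the loop-free solution being asked for; the helper name `reference_d` and the tiny sample frame are hypothetical, and it assumes the first row and every FROMHERE row receive d0 unchanged.

```python
import pandas as pd

def f1(t2m, av, d):
    return t2m * av + d

def reference_d(df, d0):
    """Hypothetical helper: compute d row by row, as the table describes."""
    out = []
    prev = d0
    for i, (t2m, av, kont) in enumerate(zip(df["t2m"], df["av"], df["kont"])):
        if i == 0 or kont == "FROMHERE":
            # First row and every FROMHERE row restart from the known d0
            prev = d0
        else:
            # Otherwise follow the table: d[previous] + f1(t2m, av, d[previous])
            prev = prev + f1(t2m, av, prev)
        out.append(prev)
    return pd.Series(out, index=df.index, name="d")

# Made-up numbers, chosen only so the arithmetic is easy to follow
sample = pd.DataFrame({
    "t2m": [1.0, 2.0, 3.0],
    "av": [1.0, 1.0, 1.0],
    "kont": ["", "FROMHERE", ""],
})
d_ref = reference_d(sample, d0=1.0)
```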

Any help achieving this without looping over the DataFrame would be appreciated.

1 Answer:

Answer 0: (score: 2)

Define the following function:

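def f2(row):
    # Restart the running value whenever a FROMHERE marker is found
    if row.kont == 'FROMHERE':
        f2.prevD0 = d0
    # Feed the previous result into f1 and remember it for the next row
    f2.prevD0 = f1(row.t2m, row.av, f2.prevD0)
    return f2.prevD0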

Then, assuming d0 holds the appropriate value, apply this function as follows, saving the result in a new column:

f2.prevD0 = d0
df['d'] = df.apply(f2, axis=1)
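
Note that `df.apply(..., axis=1)` still calls the function once per row in Python. Because f1 only adds `t2m*av` to the previous value, the recurrence as f2 implements it collapses to a cumulative sum within each segment, which needs no row-wise call at all. The following is a minimal sketch under stated assumptions: f1 is the function from the question, d0 is a scalar (the value 1.0 is made up), and the small frame is hypothetical, with values copied from the question's table. It reproduces f2's behaviour, in which a FROMHERE row gets f1(t2m, av, d0) rather than d0 itself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a few rows from the question's table, default index
df = pd.DataFrame({
    "t2m": [0.284698, -1.304504, -1.113770, 0.345001, -0.122498],
    "av": [0.001613, 0.001418, 0.002679, 0.004633, 0.004400],
    "kont": ["", "FROMHERE", "", "FROMHERE", ""],
})
d0 = 1.0  # assumed starting value, not given in the question

def f1(t2m, av, d):
    return t2m * av + d

def f2(row):
    if row.kont == "FROMHERE":
        f2.prevD0 = d0
    f2.prevD0 = f1(row.t2m, row.av, f2.prevD0)
    return f2.prevD0

# Row-wise version from the answer, kept here only for comparison
f2.prevD0 = d0
d_apply = df.apply(f2, axis=1)

# Vectorized version: each FROMHERE marker starts a new segment,
# and within a segment d is d0 plus the running sum of t2m * av
segment = df["kont"].eq("FROMHERE").cumsum()
d_vec = d0 + (df["t2m"] * df["av"]).groupby(segment).cumsum()
df["d"] = d_vec

# Both approaches should agree up to floating-point rounding
assert np.allclose(d_apply, df["d"])
```

This shortcut works only because f1 is additive in d. For a function where d enters non-linearly, the cumulative-sum trick would not apply and some form of sequential evaluation would still be needed.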