Applying a formula to a time series

Time: 2019-01-07 12:01:12

Tags: python pandas time-series

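I have a time series of climate data and some constant atmospheric values (air pressure, etc.), and I want to use these series in a formula to calculate potential evapotranspiration. The formula looks like this: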

LET_a= (delta * (Rnet + G) + pa * cp * (VPD/Ra)) / (delta + pc * (1 + rs/Ra))

cp is a constant (an integer), rs is a constant (a float), and the rest are series with more than 300,000 values each.

A sample df with similar values:

    import pandas as pd

    df = pd.DataFrame([[0.078, -61.36, 49.56, 1.248, 0.155, 468.57],
                       [0.077, -58.38, 50.14, 1.249, 0.13, 1968.02],
                       [0.078, -54.44, 50.36, 1.249, 0.12, 3061.366]])

    df.columns = ['delta', 'Rnet', 'G', 'pa', 'VPD', 'Ra']

    cp=1005
    rs=79.36
    pc=0.0663

The expected results for these three rows are: -3.25, -3.77, -1.83

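The line below is the same formula as above, but with the values typed in by hand (using the first row of the sample data), and it produces the correct result.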

    LET_1 = (0.078 * (-61.36 + 49.56) + 1.248 * 1005 * (0.155 / 468.57)) / (0.078 + 0.0663 * (1 + 79.36 / 468.57))

LET_1 = -3.25

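The problem is that something happens to the numbers when I run this code on the series (I do not get the correct result), and I do not know why.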

Is it because of the mix of series and floats? Does it need more parentheses, or a different way of writing it?

In theory, the formula should be applied to every single value in the series and produce a new series.

When I use the same formula with plain numbers, I get the correct result. So it must be something in how the code is "spelled".

Any suggestions would be greatly appreciated!

2 Answers:

Answer 0: (score: 0)

    # Numerator and denominator each need their own parentheses,
    # otherwise "/" only divides the term directly in front of it.
    df["LET_a"] = df.apply(lambda x: (x.delta * (x.Rnet + x.G) + x.pa * cp * (x.VPD / x.Ra)) / (x.delta + pc * (1 + rs / x.Ra)), axis=1)
    df

       delta   Rnet      G     pa    VPD        Ra     LET_a
    0  0.078 -61.36  49.56  1.248  0.155   468.570 -3.250232
    1  0.077 -58.38  50.14  1.249  0.130  1968.020 -3.778515
    2  0.078 -54.44  50.36  1.249  0.120  3061.366 -1.842481
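For comparison, here is a minimal sketch that runs the row-wise `apply` pattern next to the same expression evaluated on whole columns, using the sample data and constants from the question. The helper name `let_row` and the variable names `row_wise` and `column_wise` are illustrative assumptions, not part of the original post. Splitting the formula into a named numerator and denominator is one way to make sure the division covers the whole numerator.

```python
import pandas as pd

# Sample data and constants copied from the question.
df = pd.DataFrame(
    [[0.078, -61.36, 49.56, 1.248, 0.155, 468.57],
     [0.077, -58.38, 50.14, 1.249, 0.13, 1968.02],
     [0.078, -54.44, 50.36, 1.249, 0.12, 3061.366]],
    columns=['delta', 'Rnet', 'G', 'pa', 'VPD', 'Ra'],
)
cp, rs, pc = 1005, 79.36, 0.0663


def let_row(x):
    # Hypothetical helper: numerator and denominator are named separately
    # so that the division applies to the whole numerator.
    numerator = x.delta * (x.Rnet + x.G) + x.pa * cp * (x.VPD / x.Ra)
    denominator = x.delta + pc * (1 + rs / x.Ra)
    return numerator / denominator


# Row by row: let_row is called once per row.
row_wise = df.apply(let_row, axis=1)

# The same expression evaluated on whole columns at once.
column_wise = (df.delta * (df.Rnet + df.G) + df.pa * cp * (df.VPD / df.Ra)) / (
    df.delta + pc * (1 + rs / df.Ra)
)

# Largest absolute difference between the two approaches.
print((row_wise - column_wise).abs().max())
```

Both forms should agree to within floating-point noise. With more than 300,000 rows, the column-wise form is usually the faster of the two, because `apply(..., axis=1)` calls a Python function once per row.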

Answer 1: (score: 0)

You can easily calculate your data with the following line:

    LET_a = (df.delta * (df.Rnet + df.G) + df.pa * cp * (df.VPD / df.Ra)) / (df.delta + pc * (1 + rs / df.Ra))

If you want to add the result as a column to your DataFrame, you can write it like this:

    df['LET_a'] = (df.delta * (df.Rnet + df.G) + df.pa * cp * (df.VPD / df.Ra)) / (df.delta + pc * (1 + rs / df.Ra))

Both yield the same result:

    # Out:    0    -3.250232
    #         1    -3.778515
    #         2    -1.842481
    #      dtype: float64
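As a rough sanity check, the sketch below compares this result with the expected values quoted in the question (-3.25, -3.77, -1.83). The variable names `expected` and `max_gap`, and the 0.02 tolerance, are assumptions chosen for illustration. The quoted values appear to be approximate, so the comparison uses a tolerance instead of exact equality.

```python
import pandas as pd

# Sample data and constants copied from the question.
df = pd.DataFrame(
    [[0.078, -61.36, 49.56, 1.248, 0.155, 468.57],
     [0.077, -58.38, 50.14, 1.249, 0.13, 1968.02],
     [0.078, -54.44, 50.36, 1.249, 0.12, 3061.366]],
    columns=['delta', 'Rnet', 'G', 'pa', 'VPD', 'Ra'],
)
cp, rs, pc = 1005, 79.36, 0.0663

let_a = (df.delta * (df.Rnet + df.G) + df.pa * cp * (df.VPD / df.Ra)) / (
    df.delta + pc * (1 + rs / df.Ra)
)

# Expected values as quoted in the question (assumed to be approximate).
expected = pd.Series([-3.25, -3.77, -1.83])

# Largest absolute gap between the computed and the quoted values.
max_gap = (let_a - expected).abs().max()
print(let_a.round(2).tolist())
print(max_gap < 0.02)
```

The first row matches the hand-typed calculation from the question. The second and third rows differ from the quoted values by roughly 0.01, which is consistent with the question giving approximate figures.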

There are a few issues with your code:

  1. Variable naming conventions: Variables should be named in lowercase, and underscores may be used. For DataFrame columns you can also use other naming styles, but I still recommend sticking to lowercase names without spaces, since this makes sure that you can always access the data with "dots" instead of having to write something like df['A BC'] explicitly. Constants should be named in uppercase if they are global or at module level. See the Python style guide (PEP 8) on naming conventions.
  2. Did you check the bracketing of your code? I cannot reproduce the correct result with your code, even when typing the formula by hand. It looks like you misplaced some brackets. Bracketing rules in Python are the same as for any other mathematical equation: division binds more tightly than addition, so the whole numerator and the whole denominator each need their own pair of parentheses.
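Both points can be illustrated with plain numbers. The sketch below uses the first row of the sample data, with the constants renamed to uppercase and the row values to lowercase. These names (`CP`, `RS`, `PC`, `rnet`, `bracketed`, `unbracketed`, and so on) are illustrative assumptions. The second expression is one possible way the brackets could go missing, not necessarily what happened in the original code.

```python
# Module-level constants, named in uppercase as suggested above.
CP = 1005
RS = 79.36
PC = 0.0663

# First row of the sample data from the question, in lowercase names.
delta, rnet, g, pa, vpd, ra = 0.078, -61.36, 49.56, 1.248, 0.155, 468.57

# Numerator and denominator each wrapped in their own parentheses.
bracketed = (delta * (rnet + g) + pa * CP * (vpd / ra)) / (delta + PC * (1 + RS / ra))

# Outer parentheses dropped: "/" now only divides the term directly
# in front of it by delta, and the remaining terms are simply added.
unbracketed = delta * (rnet + g) + pa * CP * (vpd / ra) / delta + PC * (1 + RS / ra)

print(round(bracketed, 2))
print(round(unbracketed, 2))
```

The bracketed form reproduces the hand-typed result of about -3.25, while the unbracketed form gives a positive number. A sign flip like that is a quick hint that operator precedence, not the mix of series and floats, is the cause.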