I want to recalculate column "a" of a given DataFrame, df, but the way I am doing it does not carry the previously calculated values forward.
import pandas as pd
import numpy as np
from numpy.random import randn
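df = pd.DataFrame(randn(100))
df["a"] = np.nan
df["b"] = randn(100)  # one random value per row
df.loc[0, "a"] = 0.5  # starting value; avoids chained assignment
df["a"] = df["a"].shift(1) * df["b"]  # this step does not give the result I expect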
Do you have any ideas on how to solve this?
I want to calculate "a" from its own previous value and the current value of "b":
a     b
0.5   2    # starting value (set explicitly to 0.5); there is no earlier a, so nothing is calculated
1.5   3    # a = previous a * b (0.5 * 3) = 1.5
15    10   # a = previous a * b (1.5 * 10) = 15
45    3    # a = previous a * b (15 * 3) = 45
The problem is that the calculation is either not performed or its result does not overwrite the previously set values.
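A minimal sketch of what seems to be going on, using the four rows from the table above (`result` is just an illustrative name): `shift(1)` is evaluated once on the whole column, so each row only sees the value that "a" held before the assignment, never the freshly computed one. Only the row right after the starting value gets a number.

```python
import numpy as np
import pandas as pd

# Four-row example: only the first value of "a" is known.
df = pd.DataFrame({"a": [0.5, np.nan, np.nan, np.nan],
                   "b": [2.0, 3.0, 10.0, 3.0]})

# shift(1) works on the column as it is right now, so rows 2 and 3
# are multiplied by NaN rather than by a newly computed value.
result = df["a"].shift(1) * df["b"]
print(result)
```

In this sketch only the second row ends up with a value (0.5 * 3); every other row stays NaN.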
Answer 0: (score: 1)
How about this?
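df = pd.DataFrame({'a': [None] * 4, 'b': [2, 3, 10, 3]})
df.loc[0, 'a'] = 0.5  # starting value; .loc avoids chained assignment
# running product of b[1:], scaled by the starting value, fills the remaining rows
df.loc[df.index[1:], 'a'] = (df['b'].shift(-1).cumprod() * df.loc[0, 'a']).iloc[:-1].values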
>>> df
a b
0 0.5 2
1 1.5 3
2 15 10
3 45 3
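The reason a cumulative product works here is that the recurrence a[i] = a[i-1] * b[i] unrolls to a[i] = a[0] * b[1] * ... * b[i]. The following is a sketch of the same idea written without the shift, assuming a float column is acceptable; `start` and `factors` are illustrative names, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"b": [2, 3, 10, 3]})
start = 0.5  # assumed starting value for a[0]

# b[0] plays no part in the recurrence, so neutralise it with 1.0
factors = df["b"].astype(float)
factors.iloc[0] = 1.0

# a[i] = start * b[1] * ... * b[i]
df["a"] = start * factors.cumprod()
print(df)
```

This should reproduce the values 0.5, 1.5, 15 and 45 from the question.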
Answer 1: (score: 0)
You can do this with a for loop:
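# assumes the default integer index 0, 1, 2, ... (.ix has been removed from pandas)
for i in df.index[1:]:
    df.loc[i, "a"] = df.loc[i, "b"] * df.loc[i - 1, "a"]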
If anyone knows a vectorized way to do this, I would be interested to see it.
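As a possible middle ground, here is a sketch that keeps the explicit step-by-step recurrence but avoids per-row DataFrame lookups by using `itertools.accumulate` from the standard library (the `initial` argument needs Python 3.8 or later). It is still a Python-level loop, not a vectorized operation, and `start` is an illustrative name.

```python
from itertools import accumulate
import operator

import pandas as pd

df = pd.DataFrame({"b": [2, 3, 10, 3]})
start = 0.5  # assumed starting value for a[0]

# accumulate yields start, start*b[1], start*b[1]*b[2], ...
# b[0] is skipped because the first row has no previous value.
df["a"] = list(accumulate(df["b"].iloc[1:], operator.mul, initial=start))
print(df)
```

The same pattern should also work for recurrences that are not a plain product, by swapping `operator.mul` for another two-argument function.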