I have a DataFrame `df` with the data below. I need to compute the symmetric percentage change (formula below) on the PX_LAST column.
x[t] = 200 * (X[t] - X[t-1]) / (X[t] + X[t-1])
My code:
z = []
for j in range(1, len(df.index)):
    z.append((df.iloc[j, 1] - df.iloc[j - 1, 1]) / (df.iloc[j, 1] + df.iloc[j - 1, 1]) * 200.00)
This works, and it returns the output as a list. Is there a better way to perform this calculation and return the output as a DataFrame?
        date  PX_LAST
0 2011-01-31     10.9
1 2011-02-28     10.5
2 2011-03-31     11.2
3 2011-04-30      4.9
4 2011-05-31     24.3
5 2011-06-30      8.4
6 2011-07-31     37.2
7 2011-08-31     93.2
Expected output in a DataFrame:
0 NaN
1 -3.73831
2 6.45161
3 -78.2608
4 132.8767
5 -97.24770
6 85.88957
7 -158.4615
Thanks!
Answer 0 (score: 1)
Option 1
p = df.PX_LAST
(p.diff() / p.rolling(2).sum() * 200).fillna(0)
0 0.000000
1 -3.738318
2 6.451613
3 -78.260870
4 132.876712
5 -97.247706
6 126.315789
7 85.889571
Name: PX_LAST, dtype: float64
p = df.PX_LAST
df.assign(SymPct=(p.diff() / p.rolling(2).sum() * 200).fillna(0))
        date  PX_LAST      SymPct
0 2011-01-31     10.9    0.000000
1 2011-02-28     10.5   -3.738318
2 2011-03-31     11.2    6.451613
3 2011-04-30      4.9  -78.260870
4 2011-05-31     24.3  132.876712
5 2011-06-30      8.4  -97.247706
6 2011-07-31     37.2  126.315789
7 2011-08-31     93.2   85.889571
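To reproduce Option 1 end to end, here is a minimal sketch. The frame construction is a reconstruction of the sample data posted in the question, and `.fillna(0)` is deliberately left out on the assumption that the first row should stay NaN, as in the question's expected output, rather than 0.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2011-01-31", "2011-02-28", "2011-03-31", "2011-04-30",
        "2011-05-31", "2011-06-30", "2011-07-31", "2011-08-31",
    ]),
    "PX_LAST": [10.9, 10.5, 11.2, 4.9, 24.3, 8.4, 37.2, 93.2],
})

p = df["PX_LAST"]

# diff() gives X[t] - X[t-1]; rolling(2).sum() gives X[t] + X[t-1]
sym_pct = p.diff() / p.rolling(2).sum() * 200

# Without fillna(0) the first row stays NaN, since it has no predecessor
result = df.assign(SymPct=sym_pct)
print(result)
```

Whether to keep the NaN or fill it with 0 depends on how the column is used downstream; a 0 reads as "no change", which is not quite the same as "undefined".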
Option 2
NumPy approach
p = df.PX_LAST.values
np.diff(p) / (p[:-1] + p[1:]) * 200
array([ -3.73831776, 6.4516129 , -78.26086957, 132.87671233,
-97.24770642, 126.31578947, 85.88957055])
p = df.PX_LAST.values
df.assign(SymPct=np.append(0, np.diff(p) / (p[:-1] + p[1:]) * 200))
        date  PX_LAST      SymPct
0 2011-01-31     10.9    0.000000
1 2011-02-28     10.5   -3.738318
2 2011-03-31     11.2    6.451613
3 2011-04-30      4.9  -78.260870
4 2011-05-31     24.3  132.876712
5 2011-06-30      8.4  -97.247706
6 2011-07-31     37.2  126.315789
7 2011-08-31     93.2   85.889571
Option 3
A bit more elegant
p = df.PX_LAST.values
t0, t1 = p[:-1], p[1:]
df.assign(SymPct=np.append(0, (t1 - t0) / (t1 + t0) * 200))
        date  PX_LAST      SymPct
0 2011-01-31     10.9    0.000000
1 2011-02-28     10.5   -3.738318
2 2011-03-31     11.2    6.451613
3 2011-04-30      4.9  -78.260870
4 2011-05-31     24.3  132.876712
5 2011-06-30      8.4  -97.247706
6 2011-07-31     37.2  126.315789
7 2011-08-31     93.2   85.889571
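If the calculation is needed in more than one place, the NumPy variants could be wrapped in a small helper. This is only a sketch: `sym_pct_change` is a hypothetical name (not a pandas or NumPy function), and it pads the first element with NaN instead of 0, assuming that matches the expected output in the question.

```python
import numpy as np
import pandas as pd


def sym_pct_change(values):
    """Symmetric percent change: 200 * (X[t] - X[t-1]) / (X[t] + X[t-1])."""
    x = np.asarray(values, dtype=float)
    if x.size == 0:
        return x  # nothing to compute
    t0, t1 = x[:-1], x[1:]
    # Prepend NaN because the first observation has no previous value
    return np.concatenate(([np.nan], (t1 - t0) / (t1 + t0) * 200))


# Sample PX_LAST values from the question
df = pd.DataFrame({"PX_LAST": [10.9, 10.5, 11.2, 4.9, 24.3, 8.4, 37.2, 93.2]})
out = df.assign(SymPct=sym_pct_change(df["PX_LAST"]))
print(out)
```

Note that a pair of values summing to zero would produce inf or NaN here (with a NumPy runtime warning); the sample data never hits that case, so it is not handled.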
Answer 1 (score: 1)
Option 1] Using transform
In [517]: df.PX_LAST.transform(lambda x: 200*(x - x.shift())/(x + x.shift()))
Out[517]:
0 NaN
1 -3.738318
2 6.451613
3 -78.260870
4 132.876712
5 -97.247706
6 126.315789
7 85.889571
Name: PX_LAST, dtype: float64
Option 2] Vectorize
In [522]: px = df.PX_LAST
In [523]: pxs = df.PX_LAST.shift()
In [524]: 200 * (px - pxs)/(px + pxs)
Out[524]:
0 NaN
1 -3.738318
2 6.451613
3 -78.260870
4 132.876712
5 -97.247706
6 126.315789
7 85.889571
Name: PX_LAST, dtype: float64
Return a new DataFrame containing the result
In [525]: df.assign(SymPct=200 * (px - pxs)/(px + pxs))
Out[525]:
        date  PX_LAST      SymPct
0 2011-01-31     10.9         NaN
1 2011-02-28     10.5   -3.738318
2 2011-03-31     11.2    6.451613
3 2011-04-30      4.9  -78.260870
4 2011-05-31     24.3  132.876712
5 2011-06-30      8.4  -97.247706
6 2011-07-31     37.2  126.315789
7 2011-08-31     93.2   85.889571
Or, add it to the existing df
In [527]: df['SymPct'] = 200 * (px - pxs)/(px + pxs)
In [528]: df
Out[528]:
        date  PX_LAST      SymPct
0 2011-01-31     10.9         NaN
1 2011-02-28     10.5   -3.738318
2 2011-03-31     11.2    6.451613
3 2011-04-30      4.9  -78.260870
4 2011-05-31     24.3  132.876712
5 2011-06-30      8.4  -97.247706
6 2011-07-31     37.2  126.315789
7 2011-08-31     93.2   85.889571
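As a sanity check, the loop from the question and the vectorized `shift` version can be compared directly. The sketch below rebuilds the sample frame (dates kept as plain strings for brevity, which is an assumption about the dtype) and relies on PX_LAST being the column at position 1, as in the question's code.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({
    "date": ["2011-01-31", "2011-02-28", "2011-03-31", "2011-04-30",
             "2011-05-31", "2011-06-30", "2011-07-31", "2011-08-31"],
    "PX_LAST": [10.9, 10.5, 11.2, 4.9, 24.3, 8.4, 37.2, 93.2],
})

# Loop from the question (column position 1 is PX_LAST)
z = []
for j in range(1, len(df.index)):
    prev, cur = df.iloc[j - 1, 1], df.iloc[j, 1]
    z.append((cur - prev) / (cur + prev) * 200.0)

# Vectorized version using shift()
px = df["PX_LAST"]
pxs = px.shift()
vec = 200 * (px - pxs) / (px + pxs)

# The loop produces no entry for row 0, so compare from row 1 onward
same = np.allclose(z, vec.iloc[1:].to_numpy())
print(same)
```

The only difference between the two is the first row: the loop skips it, while the vectorized version keeps it as NaN so the result lines up with the original index.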