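Need help calculating the symmetric percentage change of a pandas DataFrame - python

Time: 2017-09-15 19:04:05

Tags: python pandas dataframe

I have a DataFrame 'df' with the data below. I need to compute the symmetric percentage change (formula below) on the PX_LAST column.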

x[t] = 200 * (X[t] - X[t-1]) / (X[t] + X[t-1])
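
As a quick sanity check of the formula, here is a minimal sketch that applies it to a single pair of consecutive values. The function name `sym_pct_change` is made up for illustration and is not part of pandas.

```python
def sym_pct_change(prev, curr):
    """Symmetric percentage change between two consecutive values."""
    return 200 * (curr - prev) / (curr + prev)

# First two PX_LAST values from the sample data: 10.9, then 10.5
first = sym_pct_change(10.9, 10.5)
print(round(first, 6))  # -3.738318
```

Swapping the two arguments only flips the sign of the result, which is what makes this measure symmetric.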

My code:

z = []  
for j in range(1, len(df.index)):
    z.append((df.iloc[j, 1] - df.iloc[j - 1, 1]) / (df.iloc[j, 1] + df.iloc[j - 1, 1]) * 200.00)

This works fine and returns the output as a list. Is there a better way to perform this calculation and return the output as a DataFrame?

        date             PX_LAST       
0      2011-01-31           10.9    
1      2011-02-28           10.5    
2      2011-03-31           11.2  
3      2011-04-30            4.9   
4      2011-05-31           24.3  
5      2011-06-30            8.4  
6      2011-07-31           37.2  
7      2011-08-31           93.2  

Expected output in a DataFrame:

0      NaN
1      -3.73831
2      6.45161
3     -78.2608
4      132.8767
5     -97.24770
6      85.88957
7     -158.4615

Thanks!

2 answers:

Answer 0: (score: 1)

Option 1

p = df.PX_LAST
(p.diff() / p.rolling(2).sum() * 200).fillna(0)

0      0.000000
1     -3.738318
2      6.451613
3    -78.260870
4    132.876712
5    -97.247706
6    126.315789
7     85.889571
Name: PX_LAST, dtype: float64

p = df.PX_LAST
df.assign(SymPct=(p.diff() / p.rolling(2).sum() * 200).fillna(0))

         date  PX_LAST      SymPct
0  2011-01-31     10.9    0.000000
1  2011-02-28     10.5   -3.738318
2  2011-03-31     11.2    6.451613
3  2011-04-30      4.9  -78.260870
4  2011-05-31     24.3  132.876712
5  2011-06-30      8.4  -97.247706
6  2011-07-31     37.2  126.315789
7  2011-08-31     93.2   85.889571
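
Option 1 works because `p.rolling(2).sum()` gives X[t] + X[t-1] for every row except the first, where it is NaN. The `fillna(0)` call then turns that first NaN into 0, whereas the expected output in the question shows NaN there. A minimal sketch that keeps the NaN, assuming the sample data from the question is rebuilt by hand:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "date": ["2011-01-31", "2011-02-28", "2011-03-31", "2011-04-30",
             "2011-05-31", "2011-06-30", "2011-07-31", "2011-08-31"],
    "PX_LAST": [10.9, 10.5, 11.2, 4.9, 24.3, 8.4, 37.2, 93.2],
})

p = df["PX_LAST"]
pair_sum = p.rolling(2).sum()         # X[t] + X[t-1]; NaN in row 0
sym_pct = p.diff() / pair_sum * 200   # no fillna, so row 0 stays NaN
result = df.assign(SymPct=sym_pct)
print(result)
```

Whether the first row should be 0 or NaN depends on how the result is used downstream; NaN makes it explicit that no previous value exists.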

Option 2
NumPy approach

import numpy as np

p = df.PX_LAST.values
np.diff(p) / (p[:-1] + p[1:]) * 200

array([  -3.73831776,    6.4516129 ,  -78.26086957,  132.87671233,
        -97.24770642,  126.31578947,   85.88957055])

p = df.PX_LAST.values
df.assign(SymPct=np.append(0, np.diff(p) / (p[:-1] + p[1:]) * 200))

         date  PX_LAST      SymPct
0  2011-01-31     10.9    0.000000
1  2011-02-28     10.5   -3.738318
2  2011-03-31     11.2    6.451613
3  2011-04-30      4.9  -78.260870
4  2011-05-31     24.3  132.876712
5  2011-06-30      8.4  -97.247706
6  2011-07-31     37.2  126.315789
7  2011-08-31     93.2   85.889571

Option 3
A bit more elegant

p = df.PX_LAST.values
t0, t1 = p[:-1], p[1:]
df.assign(SymPct=np.append(0, (t1 - t0) / (t1 + t0) * 200))

         date  PX_LAST      SymPct
0  2011-01-31     10.9    0.000000
1  2011-02-28     10.5   -3.738318
2  2011-03-31     11.2    6.451613
3  2011-04-30      4.9  -78.260870
4  2011-05-31     24.3  132.876712
5  2011-06-30      8.4  -97.247706
6  2011-07-31     37.2  126.315789
7  2011-08-31     93.2   85.889571
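
The NumPy options pad the first row with 0 because slicing `p[:-1]` and `p[1:]` yields one element fewer than the original array. If a NaN in the first row is preferred, as in the question's expected output, one possible variation is to pad with `np.nan` instead. This is a sketch on a plain array holding the sample values, not the answer's exact code:

```python
import numpy as np

# PX_LAST values copied from the question
p = np.array([10.9, 10.5, 11.2, 4.9, 24.3, 8.4, 37.2, 93.2])

t0, t1 = p[:-1], p[1:]                # previous and current values
change = (t1 - t0) / (t1 + t0) * 200  # one element shorter than p
sym_pct = np.append(np.nan, change)   # pad the front so lengths match
print(sym_pct)
```

The padded array has the same length as the column, so it can be passed straight to `df.assign(SymPct=sym_pct)`.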

Answer 1: (score: 1)

Option 1] Using transform

In [517]: df.PX_LAST.transform(lambda x: 200*(x - x.shift())/(x + x.shift()))
Out[517]:
0           NaN
1     -3.738318
2      6.451613
3    -78.260870
4    132.876712
5    -97.247706
6    126.315789
7     85.889571
Name: PX_LAST, dtype: float64

Option 2] Vectorize

In [522]: px = df.PX_LAST

In [523]: pxs = df.PX_LAST.shift()

In [524]: 200 * (px - pxs)/(px + pxs)
Out[524]:
0           NaN
1     -3.738318
2      6.451613
3    -78.260870
4    132.876712
5    -97.247706
6    126.315789
7     85.889571
Name: PX_LAST, dtype: float64

Return a new DataFrame containing the result

In [525]: df.assign(SymPct=200 * (px - pxs)/(px + pxs))
Out[525]:
         date  PX_LAST      SymPct
0  2011-01-31     10.9         NaN
1  2011-02-28     10.5   -3.738318
2  2011-03-31     11.2    6.451613
3  2011-04-30      4.9  -78.260870
4  2011-05-31     24.3  132.876712
5  2011-06-30      8.4  -97.247706
6  2011-07-31     37.2  126.315789
7  2011-08-31     93.2   85.889571

Or, add it to the existing df

In [527]: df['SymPct'] = 200 * (px - pxs)/(px + pxs)

In [528]: df
Out[528]:
         date  PX_LAST      SymPct
0  2011-01-31     10.9         NaN
1  2011-02-28     10.5   -3.738318
2  2011-03-31     11.2    6.451613
3  2011-04-30      4.9  -78.260870
4  2011-05-31     24.3  132.876712
5  2011-06-30      8.4  -97.247706
6  2011-07-31     37.2  126.315789
7  2011-08-31     93.2   85.889571
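
To reuse the shift-based calculation on other columns, it could be wrapped in a small helper. The sketch below is only an illustration: `add_sym_pct` and its parameters are hypothetical names, and the data is the sample from the question. It also cross-checks the vectorized result against the original loop.

```python
import pandas as pd

def add_sym_pct(frame, col="PX_LAST", out="SymPct"):
    """Return a copy of `frame` with the symmetric percentage change of `col`."""
    cur = frame[col]
    prev = cur.shift()  # previous row's value; NaN for the first row
    return frame.assign(**{out: 200 * (cur - prev) / (cur + prev)})

# Sample data copied from the question
df = pd.DataFrame({
    "date": ["2011-01-31", "2011-02-28", "2011-03-31", "2011-04-30",
             "2011-05-31", "2011-06-30", "2011-07-31", "2011-08-31"],
    "PX_LAST": [10.9, 10.5, 11.2, 4.9, 24.3, 8.4, 37.2, 93.2],
})
result = add_sym_pct(df)

# Cross-check against the loop from the question (column 1 is PX_LAST)
z = []
for j in range(1, len(df.index)):
    z.append((df.iloc[j, 1] - df.iloc[j - 1, 1]) / (df.iloc[j, 1] + df.iloc[j - 1, 1]) * 200.00)

matches = all(abs(a - b) < 1e-9 for a, b in zip(z, result["SymPct"].iloc[1:]))
print(matches)
```

Because the helper uses `assign`, the original `df` is left unchanged. Rows where the current and previous values sum to zero are not handled specially here.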