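How to calculate the percentage change over future rows in a Pandas DataFrame

Time: 2017-11-14 16:01:04

Tags: python pandas

I have a Pandas DataFrame and I am trying to calculate the maximum of the next x rows as a percentage of the current row. For example, I have something like this: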

|  datetime           |  open    |   high   |   low    |  close   |   volume    |
| 2016-12-06 14:00:00 | 0.009142 | 0.009152 | 0.008839 | 0.009038 |  888.080994 |
| 2016-12-06 15:00:00 | 0.009030 | 0.009200 | 0.008887 | 0.009076 | 1245.985840 |
| 2016-12-06 16:00:00 | 0.009070 | 0.009510 | 0.008992 | 0.009510 | 1630.514648 |
| 2016-12-06 17:00:00 | 0.009510 | 0.009889 | 0.009500 | 0.009677 | 2944.323730 |
| 2016-12-06 18:00:00 | 0.009677 | 0.009764 | 0.009400 | 0.009403 |  980.190186 |
| 2016-12-06 19:00:00 | 0.009410 | 0.009580 | 0.009361 | 0.009515 |  651.947754 |
| 2016-12-06 20:00:00 | 0.009515 | 0.010175 | 0.009510 | 0.009925 | 1637.252319 |
| 2016-12-06 21:00:00 | 0.009915 | 0.010430 | 0.009900 | 0.010383 | 2029.841675 |

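I would like to add a column showing the highest price in the next n rows, expressed as a percentage of the current row's open value.

This is as far as I have got, going from the "high" column to the "high" column (ideally I would like to use open to high).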

periods = 3
df['high_pct'] = df['high'].rolling(periods).max().pct_change().shift(-periods)

But this leaves me with some zero values, and I can't work out why:

|      datetime       |   open   |   high   |   low    |  close   |   volume    | high_pct |
| 2016-12-06 14:00:00 | 0.009142 | 0.009152 | 0.008839 | 0.009038 |  888.080994 | 0.039841 |
| 2016-12-06 15:00:00 | 0.009030 | 0.009200 | 0.008887 | 0.009076 | 1245.985840 | 0.000000 |
| 2016-12-06 16:00:00 | 0.009070 | 0.009510 | 0.008992 | 0.009510 | 1630.514648 | 0.000000 |
| 2016-12-06 17:00:00 | 0.009510 | 0.009889 | 0.009500 | 0.009677 | 2944.323730 | 0.028932 |
| 2016-12-06 18:00:00 | 0.009677 | 0.009764 | 0.009400 | 0.009403 |  980.190186 | 0.025062 |
| 2016-12-06 19:00:00 | 0.009410 | 0.009580 | 0.009361 | 0.009515 |  651.947754 | 0.000947 |
| 2016-12-06 20:00:00 | 0.009515 | 0.010175 | 0.009510 | 0.009925 | 1637.252319 | 0.000000 |
| 2016-12-06 21:00:00 | 0.009915 | 0.010430 | 0.009900 | 0.010383 | 2029.841675 | 0.000000 |

Am I on the right track? Could someone suggest a different approach if needed?

Thanks!

1 Answer:

Answer 0: (score: 0)

IIUC, I think you want something like this:

df['high_pct'] = df['high'].rolling(3, min_periods=1).max().shift(-2).ffill() / df['high'] - 1

Output:

                datetime      open      high       low     close       volume  \
0   2016-12-06 14:00:00   0.009142  0.009152  0.008839  0.009038   888.080994   
1   2016-12-06 15:00:00   0.009030  0.009200  0.008887  0.009076  1245.985840   
2   2016-12-06 16:00:00   0.009070  0.009510  0.008992  0.009510  1630.514648   
3   2016-12-06 17:00:00   0.009510  0.009889  0.009500  0.009677  2944.323730   
4   2016-12-06 18:00:00   0.009677  0.009764  0.009400  0.009403   980.190186   
5   2016-12-06 19:00:00   0.009410  0.009580  0.009361  0.009515   651.947754   
6   2016-12-06 20:00:00   0.009515  0.010175  0.009510  0.009925  1637.252319   
7   2016-12-06 21:00:00   0.009915  0.010430  0.009900  0.010383  2029.841675   

   high_pct  
0  0.039117  
1  0.074891  
2  0.039853  
3  0.000000  
4  0.042093  
5  0.088727  
6  0.025061  
7  0.000000
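
For anyone who wants to reproduce this, here is a minimal sketch built on the sample rows from the question. It shows where the zeros in the original attempt probably come from, parameterises the formula above with `periods`, and adds an open-to-high variant. The names `trailing_max`, `window_to_window`, `future_max` and `next_high_pct` are made up for illustration, and the reversed-series trick is just one possible way to get a forward-looking window, not necessarily what the asker ended up using.

```python
import pandas as pd

# Sample data copied from the question (only the columns needed here).
df = pd.DataFrame({
    "open": [0.009142, 0.009030, 0.009070, 0.009510,
             0.009677, 0.009410, 0.009515, 0.009915],
    "high": [0.009152, 0.009200, 0.009510, 0.009889,
             0.009764, 0.009580, 0.010175, 0.010430],
})

periods = 3  # look-ahead length, as in the question

# Why the question's attempt yields zeros: pct_change() compares each
# rolling maximum with the previous rolling maximum, not with the row's
# own price, so it is 0 whenever two adjacent windows share the same max.
trailing_max = df["high"].rolling(periods).max()
window_to_window = trailing_max.pct_change()

# The answer's formula, parameterised: max high over the current row and
# the following (periods - 1) rows, relative to the current high.
df["high_pct"] = (
    df["high"].rolling(periods, min_periods=1).max()
    .shift(-(periods - 1)).ffill()
    / df["high"] - 1
)

# Hypothetical open-to-high variant: reverse the series so a trailing
# window becomes a forward-looking one, then shift by one row so the
# window covers only the next `periods` rows (current row excluded).
future_max = (
    df["high"].iloc[::-1].rolling(periods, min_periods=1).max()
    .iloc[::-1].shift(-1)
)
df["next_high_pct"] = future_max / df["open"] - 1  # last row is NaN

print(df)
```

Two things to keep in mind with this sketch. The `ffill()` step reuses the last complete window for the final rows, so near the end of the frame the maximum can include a row that comes before the current one. The open-to-high variant instead lets the window shrink towards the end and leaves the last row as NaN, since there is no future row to compare against.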