Convert a DataFrame with index-OHLC data to index-O index-L index-H index-O (flatten OHLC data)

Asked: 2013-05-13 07:11:18

Tags: python numpy pandas

I have a DataFrame that looks like this:

                 Open       High     Low      Close   Volume (BTC)  Volume (Currency)  Weighted Price
Date                                                                                                 
2013-05-07  112.25000  114.00000   97.52  109.60013  139626.724860    14898971.673747      106.705731
2013-05-08  109.60013  116.77700  109.50  113.20000   61680.324704     6990518.957611      113.334665
2013-05-09  113.20000  113.71852  108.80  112.79900   26894.458204     3003068.410660      111.661235
2013-05-10  112.79900  122.50000  111.54  117.70000   77443.672681     9140709.083964      118.030418
2013-05-11  117.70000  118.74000  113.00  113.47000   25532.277740     2952016.798507      115.619015

I am looking for a way to convert this data into the following form:

index       open
index+1     low
index+2     high
index+3     open
index+4     low
index+5     high

So, for my example, it should look like this:

Date
2013-05-07 00:00     112.25000
2013-05-07 08:00     97.52
2013-05-07 16:00     114.00000
2013-05-08 00:00     109.60013
2013-05-08 08:00     109.50
2013-05-08 16:00     116.77700    
...

My first idea was to resample the DataFrame.

My first problem is that when I run

df2 = df.resample('8H', how='mean')

I get

                          Open       High        Low      Close   Volume (BTC)  Volume (Currency)  Weighted Price

2013-05-07 00:00:00  112.25000  114.00000   97.52000  109.60013  139626.724860    14898971.673747      106.705731
2013-05-07 08:00:00        NaN        NaN        NaN        NaN            NaN                NaN             NaN
2013-05-07 16:00:00        NaN        NaN        NaN        NaN            NaN                NaN             NaN
2013-05-08 00:00:00  109.60013  116.77700  109.50000  113.20000   61680.324704     6990518.957611      113.334665
2013-05-08 08:00:00        NaN        NaN        NaN        NaN            NaN                NaN             NaN
2013-05-08 16:00:00        NaN        NaN        NaN        NaN            NaN                NaN             NaN
2013-05-09 00:00:00  113.20000  113.71852  108.80000  112.79900   26894.458204     3003068.410660      111.661235
...

I now need to build a column of modulo-3 values, like this:

                        ModCol 

2013-05-07 00:00:00          0 
2013-05-07 08:00:00          1 
2013-05-07 16:00:00          2 
2013-05-08 00:00:00          0 
2013-05-08 08:00:00          1 
2013-05-08 16:00:00          2 
2013-05-09 00:00:00          0 
...

Then I would use np.where to build the price column (Open if Mod == 0, Low if Mod == 1, High if Mod == 2).

My problem is that I don't know how to build the ModCol column.

1 Answer:

Answer 0: (score: 1)

Here's how to create the mod column (assuming `from pandas import Series`):

In [1]: Series(range(10))
Out[1]: 
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int64

In [2]: Series(range(10)) % 3
Out[2]: 
0    0
1    1
2    2
3    0
4    1
5    2
6    0
7    1
8    2
9    0
dtype: int64
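
Applied to the question's data, the same modulo trick might look like the sketch below. It is only one possible approach, under a few assumptions: each daily row is repeated three times instead of being resampled (which sidesteps the NaN rows shown in the question), the copies are shifted by 0, 8 and 16 hours, and `np.select` stands in for the nested `np.where` the question mentions. The names `wide`, `mod` and `flat` are made up for illustration, and only the first three rows of the question's data are used.

```python
import numpy as np
import pandas as pd

# First three rows from the question (only the columns needed here).
df = pd.DataFrame(
    {
        "Open": [112.25, 109.60013, 113.2],
        "High": [114.0, 116.777, 113.71852],
        "Low": [97.52, 109.5, 108.8],
    },
    index=pd.to_datetime(["2013-05-07", "2013-05-08", "2013-05-09"]),
)
df.index.name = "Date"

n = len(df)

# Repeat every daily row three times (assumption: used instead of resample).
wide = df.iloc[np.repeat(np.arange(n), 3)].copy()

# Shift the three copies of each day to 00:00, 08:00 and 16:00.
wide.index = wide.index + pd.to_timedelta(np.tile([0, 8, 16], n), unit="h")

# The mod column: 0, 1, 2, 0, 1, 2, ...
mod = np.arange(len(wide)) % 3

# Pick Open / Low / High depending on the mod value.
price = np.select(
    [mod == 0, mod == 1, mod == 2],
    [wide["Open"].to_numpy(), wide["Low"].to_numpy(), wide["High"].to_numpy()],
)

flat = pd.Series(price, index=wide.index, name="Price")
print(flat)
```

If the resampled frame from the question is kept instead, the same `np.arange(len(df2)) % 3` expression can be assigned as `df2['ModCol']`, but the NaN rows would then need to be filled first.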