Pandas shift does not work with MultiIndex columns

Time: 2018-03-23 04:27:43

Tags: python pandas

Basically, pandas.DataFrame.shift appears to do nothing when the columns are a MultiIndex.

Given these values and the current setup:

import pandas

idx = ['2018-03-14T06:15:39.000000000', '2018-03-14T06:16:15.000000000',
       '2018-03-14T06:16:50.000000000', '2018-03-14T06:17:47.000000000',
       '2018-03-14T06:18:46.000000000']


vals = [[9.15390039e+03, 9.99999978e-03, 1.64927383e+04, 4.00000000e+00,
         1.00000000e+00, 0.00000000e+00, 9.15388965e+03, 9.99999978e-03,
         1.64928926e+04, 0.00000000e+00, 0.00000000e+00, 1.00000000e+00,
         9.15388965e+03],
        [9.15390039e+03, 9.99999978e-03, 1.64847031e+04, 9.00000000e+00,
         1.00000000e+00, 0.00000000e+00, 9.15388965e+03, 9.99999978e-03,
         1.64848359e+04, 3.00000000e+00, 0.00000000e+00, 1.00000000e+00,
         9.15388965e+03],
        [9.15999023e+03, 9.99999978e-03, 1.64850938e+04, 7.00000000e+00,
         0.00000000e+00, 1.00000000e+00, 9.16000000e+03, 9.99999978e-03,
         1.64851660e+04, 2.00000000e+00, 1.00000000e+00, 0.00000000e+00,
         9.16000000e+03],
        [9.16424023e+03, 9.99999978e-03, 1.64821777e+04, 2.20000000e+01,
         0.00000000e+00, 1.00000000e+00, 9.16425000e+03, 9.99999978e-03,
         1.64848125e+04, 2.30000000e+01, 1.00000000e+00, 0.00000000e+00,
         9.16425000e+03],
        [9.16425000e+03, 9.99999978e-03, 1.64847891e+04, 1.00000000e+01,
         1.00000000e+00, 0.00000000e+00, 9.16424023e+03, 9.99999978e-03,
         1.64849219e+04, 1.20000000e+01, 0.00000000e+00, 1.00000000e+00,
         9.16424023e+03]]

cols = [('t_2', 'price'),
        ('t_2', 'spread'),
        ('t_2', 'volume_24h'),
        ('t_2', 'time_diff'),
        ('t_2', 'buy'),
        ('t_2', 'sell'),
        ('t_1', 'price'),
        ('t_1', 'spread'),
        ('t_1', 'volume_24h'),
        ('t_1', 'time_diff'),
        ('t_1', 'buy'),
        ('t_1', 'sell'),
        ('t_0', 'target')]

df = pandas.DataFrame(vals, index=idx,
                      columns=pandas.MultiIndex.from_tuples(cols))

# Attempt to shift the target column up by one row
df['t_0']['target'] = df['t_0']['target'].shift(-1)
df.head()

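This returns exactly the same DataFrame; the shift never happens. I have been scratching my head over this for a long time without understanding why.

Am I missing something completely obvious?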

2 answers:

Answer 0 (score: 4)

You are looking for:

df[('t_0', 'target')] = df[('t_0', 'target')].shift(-1)

df[('t_0', 'target')]

2018-03-14T06:15:39.000000000    9153.88965
2018-03-14T06:16:15.000000000    9160.00000
2018-03-14T06:16:50.000000000    9164.25000
2018-03-14T06:17:47.000000000    9164.24023
2018-03-14T06:18:46.000000000           NaN
Name: (t_0, target), dtype: float64

Note that when you index in two separate steps (df['t_0']['target']), you modify a copy, not the original.
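To make the copy-versus-original point concrete, here is a minimal sketch on a small made-up frame (the data and the names `sub`, `unchanged`, and `shifted` are illustrative, not from the question). It assumes, as the answer states, that selecting `df['t_0']` hands back a copy; the explicit `.copy()` call only makes that assumption visible so the behavior is the same regardless of pandas version.

```python
import pandas as pd

# Small hypothetical frame with two-level columns (not the question's data).
cols = pd.MultiIndex.from_tuples([("t_1", "price"), ("t_0", "target")])
df = pd.DataFrame([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]], columns=cols)

# First step of the two-step form: select the 't_0' group.
# .copy() makes explicit the copy that the answer says you end up modifying.
sub = df["t_0"].copy()
sub["target"] = sub["target"].shift(-1)

# The write landed on `sub`; the original frame still holds the old values.
unchanged = df[("t_0", "target")].tolist()

# A single tuple key addresses the column on `df` itself.
df[("t_0", "target")] = df[("t_0", "target")].shift(-1)
shifted = df[("t_0", "target")].tolist()
```

After this runs, `unchanged` still holds the original three values, while `shifted` holds the values moved up one row with NaN in the last position.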

Answer 1 (score: 2)

Using MultiIndex slicing:

import pandas as pd

idx = pd.IndexSlice
df.loc[:, idx['t_0', 'target']] = df.loc[:, idx['t_0', 'target']].shift(-1)

df

                                      t_2                                    \
                                    price spread  volume_24h time_diff  buy   
2018-03-14T06:15:39.000000000  9153.90039   0.01  16492.7383       4.0  1.0   
2018-03-14T06:16:15.000000000  9153.90039   0.01  16484.7031       9.0  1.0   
2018-03-14T06:16:50.000000000  9159.99023   0.01  16485.0938       7.0  0.0   
2018-03-14T06:17:47.000000000  9164.24023   0.01  16482.1777      22.0  0.0   
2018-03-14T06:18:46.000000000  9164.25000   0.01  16484.7891      10.0  1.0   
                                           t_1                               \
                              sell       price spread  volume_24h time_diff   
2018-03-14T06:15:39.000000000  0.0  9153.88965   0.01  16492.8926       0.0   
2018-03-14T06:16:15.000000000  0.0  9153.88965   0.01  16484.8359       3.0   
2018-03-14T06:16:50.000000000  1.0  9160.00000   0.01  16485.1660       2.0   
2018-03-14T06:17:47.000000000  1.0  9164.25000   0.01  16484.8125      23.0   
2018-03-14T06:18:46.000000000  0.0  9164.24023   0.01  16484.9219      12.0   
                                                t_0  
                               buy sell      target  
2018-03-14T06:15:39.000000000  0.0  1.0  9153.88965  
2018-03-14T06:16:15.000000000  0.0  1.0  9160.00000  
2018-03-14T06:16:50.000000000  1.0  0.0  9164.25000  
2018-03-14T06:17:47.000000000  1.0  0.0  9164.24023  
2018-03-14T06:18:46.000000000  0.0  1.0         NaN
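
As a side note on the `.loc` approach above, the following sketch (again on small made-up data, with `key` and `result` as illustrative names) shows that `pd.IndexSlice['t_0', 'target']` evaluates to the plain tuple `('t_0', 'target')`, so the `.loc` form and the tuple-key form from the other answer address the same column in a single indexing step.

```python
import pandas as pd

# Small hypothetical frame with two-level columns (not the question's data).
cols = pd.MultiIndex.from_tuples([("t_1", "price"), ("t_0", "target")])
df = pd.DataFrame([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]], columns=cols)

idx = pd.IndexSlice
key = idx["t_0", "target"]  # evaluates to the plain tuple ('t_0', 'target')

# One .loc call: the row selector and the full column key are given together,
# so the assignment targets `df` directly rather than an intermediate object.
df.loc[:, key] = df.loc[:, key].shift(-1)
result = df[("t_0", "target")].tolist()
```

`IndexSlice` becomes more useful when the key contains real slices, for example selecting every second-level column under one first-level label; for a single fully specified column, the tuple alone is enough.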