How do I split data by a column in pandas?

Asked: 2018-01-26 12:31:11

Tags: pandas select time-series

I have a DataFrame like the one below:

Date       Time           
2017-12-01 00:00:00  21.64
           00:15:00  21.72
           00:30:00  21.57
           00:45:00  21.47
           01:00:00  21.42
           01:15:00  21.44
           01:30:00  21.48
           01:45:00  21.32
           02:00:00  21.27
           02:15:00  21.29
           02:30:00  21.20
           02:45:00  21.18
           03:00:00  21.19  
2017-12-02 00:00:00  22.78
           00:15:00  22.67
           00:30:00  22.54
           00:45:00  22.55

I want to split each day's data so that df1 holds the 00:00:00 rows and df2 holds the rows from 00:15:00 to 03:00:00.

How can I do this?

1 answer:

Answer 0 (score: 1)

I think you need slicers, if the second level of the index contains strings:

import pandas as pd

idx = pd.IndexSlice
# Select every date, but only the '00:00:00' label on the second level
df1 = df.loc[idx[:, '00:00:00'], :]
print(df1)
                       col
Date       Time           
2017-12-01 00:00:00  21.64
2017-12-02 00:00:00  22.78

# Label slices on the second level include both endpoints
df2 = df.loc[idx[:, '00:15:00':'03:00:00'], :]
print(df2)
                       col
Date       Time           
2017-12-01 00:15:00  21.72
           00:30:00  21.57
           00:45:00  21.47
           01:00:00  21.42
           01:15:00  21.44
           01:30:00  21.48
           01:45:00  21.32
           02:00:00  21.27
           02:15:00  21.29
           02:30:00  21.20
           02:45:00  21.18
           03:00:00  21.19
2017-12-02 00:15:00  22.67
           00:30:00  22.54
           00:45:00  22.55
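
For anyone who wants to try the slicer approach end to end, here is a minimal sketch. It assumes both index levels hold plain strings and uses a shortened, hypothetical subset of the question's data; the column name `col` is taken from the printed output above. The `sort_index()` call is there because label-range slicing on a MultiIndex generally requires a lexsorted index.

```python
import pandas as pd

# Hypothetical, shortened sample that mirrors the question's layout.
dates = ["2017-12-01"] * 4 + ["2017-12-02"] * 3
times = ["00:00:00", "00:15:00", "00:30:00", "03:00:00",
         "00:00:00", "00:15:00", "00:30:00"]
values = [21.64, 21.72, 21.57, 21.19, 22.78, 22.67, 22.54]

index = pd.MultiIndex.from_arrays([dates, times], names=["Date", "Time"])
# Sort so that range slicing on the second level is allowed.
df = pd.DataFrame({"col": values}, index=index).sort_index()

idx = pd.IndexSlice
df1 = df.loc[idx[:, "00:00:00"], :]             # midnight rows of every date
df2 = df.loc[idx[:, "00:15:00":"03:00:00"], :]  # both endpoints are included

print(df1)
print(df2)
```

String slicing works here because zero-padded `HH:MM:SS` labels sort in the same order as the times they represent.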

Another solution is to filter with a boolean mask:

# True where the second index level equals '00:00:00'
mask = df.index.get_level_values(1) == '00:00:00'
df1 = df[mask]
print(df1)
                       col
Date       Time           
2017-12-01 00:00:00  21.64
2017-12-02 00:00:00  22.78

# ~mask keeps every row whose time is not '00:00:00'
df2 = df[~mask]
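
Note that `df[~mask]` returns everything except the 00:00:00 rows, which matches the requested 00:15:00 to 03:00:00 range only when the data contains no later times. If it might, a bounded mask is one option. The sketch below assumes string labels; the 03:15:00 row and the values 21.10 and 22.31 are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with one row (03:15:00) outside the wanted range.
index = pd.MultiIndex.from_arrays(
    [["2017-12-01"] * 3 + ["2017-12-02"] * 2,
     ["00:00:00", "00:15:00", "03:15:00", "00:00:00", "02:45:00"]],
    names=["Date", "Time"],
)
df = pd.DataFrame({"col": [21.64, 21.72, 21.10, 22.78, 22.31]}, index=index)

times = df.index.get_level_values("Time")

df1 = df[times == "00:00:00"]

# Zero-padded HH:MM:SS strings compare in chronological order,
# so plain string comparison is enough to bound the range.
in_range = (times >= "00:15:00") & (times <= "03:00:00")
df2 = df[in_range]

print(df1)
print(df2)
```

Unlike the slicer version, a mask does not need the index to be sorted.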

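If the second level holds Python datetime.time objects instead of strings, change the values used for comparison accordingly: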

import datetime

idx = pd.IndexSlice
# The key must be a datetime.time object to match the type of the labels
df1 = df.loc[idx[:, datetime.time(0, 0)], :]
print(df1)
                       col
2017-12-01 00:00:00  21.64
2017-12-02 00:00:00  22.78

df2 = df.loc[idx[:, datetime.time(0, 15, 0):datetime.time(3, 0, 0)], :]
print(df2)
                       col
2017-12-01 00:15:00  21.72
           00:30:00  21.57
           00:45:00  21.47
           01:00:00  21.42
           01:15:00  21.44
           01:30:00  21.48
           01:45:00  21.32
           02:00:00  21.27
           02:15:00  21.29
           02:30:00  21.20
           02:45:00  21.18
           03:00:00  21.19
2017-12-02 00:15:00  22.67
           00:30:00  22.54
           00:45:00  22.55

The mask approach works the same way with a datetime.time value:

mask = df.index.get_level_values(1) == datetime.time(0, 0)
df1 = df[mask]

df2 = df[~mask]
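
To find out which variant applies, it can help to inspect the type of one label on the second level first. The following sketch assumes that level was built from `datetime.time` objects and uses a shortened, hypothetical subset of the question's data.

```python
import datetime

import pandas as pd

# Hypothetical sample whose second index level holds datetime.time objects.
dates = ["2017-12-01"] * 3 + ["2017-12-02"] * 2
times = [datetime.time(0, 0), datetime.time(0, 15), datetime.time(3, 0),
         datetime.time(0, 0), datetime.time(0, 15)]
index = pd.MultiIndex.from_arrays([dates, times], names=["Date", "Time"])
df = pd.DataFrame(
    {"col": [21.64, 21.72, 21.19, 22.78, 22.67]}, index=index
).sort_index()

# Check what kind of object the labels are before choosing the key type.
first_label = df.index.get_level_values("Time")[0]
print(type(first_label))

idx = pd.IndexSlice
df1 = df.loc[idx[:, datetime.time(0, 0)], :]
df2 = df.loc[idx[:, datetime.time(0, 15):datetime.time(3, 0)], :]

print(df1)
print(df2)
```

If the printed type is `str`, the string-based versions earlier in this answer are the ones to use.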