How can I move time-series data from yesterday to today in pandas?

Time: 2018-02-05 08:32:53

Tags: python pandas time-series move

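My DataFrame spans more than one day (12-02 to 12-03). For each day, I want to move the previous day's late-evening rows (12-02 22:00 up to 00:00) into the current day's data (12-03). The date and time form a MultiIndex. Analyzing the data one day at a time is more convenient, but now each day's analysis has to include the last 2 hours of the previous day, so I need this DataFrame operation.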

..
 date         time       a     b 
2015-12-02  21:00:00    23.97   0
2015-12-02  21:15:00    24.06   0
2015-12-02  21:30:00    24.03   0
2015-12-02  21:45:00    23.99   0
2015-12-02  22:00:00    24.03   0
2015-12-02  22:15:00    23.89   0
2015-12-02  22:30:00    23.71   0
2015-12-02  22:45:00    23.64   0
2015-12-02  23:00:00    23.29   0
2015-12-02  23:15:00    23.8    0
2015-12-02  23:30:00    23.82   0
2015-12-02  23:45:00    23.86   0
2015-12-03  0:00:00     23.66   0
2015-12-03  0:15:00     23.64   0
2015-12-03  0:30:00     23.7    0
2015-12-03  0:45:00     23.69   0
2015-12-03  1:00:00     23.65   0
2015-12-03  1:15:00     23.48   0
2015-12-03  1:30:00     23.45   0
..

The result should look like this (the 12-02 22:00~23:45 rows are moved to 12-03). How can I do this?

..
2015-12-02  21:00:00    23.97   0
2015-12-02  21:15:00    24.06   0
2015-12-02  21:30:00    24.03   0
2015-12-02  21:45:00    23.99   0
2015-12-03  22:00:00    24.03   0
2015-12-03  22:15:00    23.89   0
2015-12-03  22:30:00    23.71   0
2015-12-03  22:45:00    23.64   0
2015-12-03  23:00:00    23.29   0
2015-12-03  23:15:00    23.8    0
2015-12-03  23:30:00    23.82   0
2015-12-03  23:45:00    23.86   0
2015-12-03  0:00:00     23.66   0
2015-12-03  0:15:00     23.64   0
2015-12-03  0:30:00     23.7    0
2015-12-03  0:45:00     23.69   0
2015-12-03  1:00:00     23.65   0
2015-12-03  1:15:00     23.48   0
2015-12-03  1:30:00     23.45   0
..

3 answers:

Answer 0: (score: 2)

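I think you need:

from datetime import time, timedelta

# True for rows before 22:00, which keep their original date
m = df.index.get_level_values(1) < time(22, 0, 0)
idx1 = df.index.get_level_values(0)  # date level
idx2 = df.index.get_level_values(1)  # time level
# Rows at or after 22:00 get their date moved forward by one day
df.index = [idx1.where(m, idx1 + timedelta(days=1)), idx2]

print(df)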
                         a  b
date       time              
2015-12-02 21:00:00  23.97  0
           21:15:00  24.06  0
           21:30:00  24.03  0
           21:45:00  23.99  0
2015-12-03 22:00:00  24.03  0
           22:15:00  23.89  0
           22:30:00  23.71  0
           22:45:00  23.64  0
           23:00:00  23.29  0
           23:15:00  23.80  0
           23:30:00  23.82  0
           23:45:00  23.86  0
           00:00:00  23.66  0
           00:15:00  23.64  0
           00:30:00  23.70  0
           00:45:00  23.69  0
           01:00:00  23.65  0
           01:15:00  23.48  0
           01:30:00  23.45  0
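
For anyone who wants to try the index-based approach end to end, here is a minimal self-contained sketch. The five-row sample frame is made up for illustration, and it assumes the date level holds pandas timestamps and the time level holds datetime.time objects, which the question does not actually confirm.

```python
from datetime import time, timedelta

import pandas as pd

# Hypothetical sample: a few rows around midnight, indexed by (date, time).
stamps = pd.to_datetime([
    "2015-12-02 21:45", "2015-12-02 22:00", "2015-12-02 23:45",
    "2015-12-03 00:00", "2015-12-03 00:15",
])
df = pd.DataFrame(
    {"a": [23.99, 24.03, 23.86, 23.66, 23.64], "b": 0},
    index=pd.MultiIndex.from_arrays(
        [stamps.normalize(), stamps.time], names=["date", "time"]
    ),
)

dates = df.index.get_level_values("date")
times = df.index.get_level_values("time")

# Rows at or after 22:00 belong to the next day's analysis window.
late = times >= time(22, 0)

# Index.where keeps the original date where the condition is True,
# so pass the inverted mask and supply the shifted dates as the fallback.
new_dates = dates.where(~late, dates + timedelta(days=1))

shifted = df.copy()
shifted.index = pd.MultiIndex.from_arrays(
    [new_dates, times], names=["date", "time"]
)
print(shifted)
```

Building the new index with MultiIndex.from_arrays keeps the level names explicit, and working on a copy leaves the original frame untouched for comparison.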

Answer 1: (score: 1)

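This approach should work. First extract the hour from each time, then add one day to the date for rows where the hour is >= 22.

import pandas as pd
from datetime import timedelta

# Assumes 'date' (datetime) and 'time' ('HH:MM:SS' strings) are ordinary
# columns, e.g. after df.reset_index()
df['hour'] = pd.to_datetime(df['time'], format='%H:%M:%S').dt.hour
# Move rows from 22:00 onward to the next day
df.loc[df['hour'] >= 22, 'date'] = df['date'] + timedelta(days=1)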
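
Since the question has date and time in a MultiIndex rather than in columns, the column-based idea probably needs a reset_index / set_index round trip. A rough sketch under that assumption follows. The sample frame and the names flat, late, and result are invented for illustration, and the time level is assumed to hold datetime.time objects.

```python
import pandas as pd

# Hypothetical sample with a (date, time) MultiIndex like the question's.
stamps = pd.to_datetime(
    ["2015-12-02 21:45", "2015-12-02 22:00", "2015-12-03 00:00"]
)
df = pd.DataFrame(
    {"a": [23.99, 24.03, 23.66], "b": 0},
    index=pd.MultiIndex.from_arrays(
        [stamps.normalize(), stamps.time], names=["date", "time"]
    ),
)

# Turn both index levels into ordinary columns so they can be edited.
flat = df.reset_index()

# The 'time' column holds datetime.time objects here, so read .hour directly.
late = flat["time"].map(lambda t: t.hour) >= 22

# Push the late-evening rows onto the following day.
flat.loc[late, "date"] = flat.loc[late, "date"] + pd.Timedelta(days=1)

# Restore the original MultiIndex layout.
result = flat.set_index(["date", "time"])
print(result)
```

This avoids leaving a helper 'hour' column behind in the frame, at the cost of rebuilding the index.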

Answer 2: (score: 0)

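I'm not sure this is the fastest way, but you could consider using np.where:

import numpy as np
import pandas as pd

# Assumes 'date' and 'time' are ordinary columns and 'time' holds
# 'HH:MM:SS' strings, which are compared lexicographically here
df["date"] = pd.to_datetime(df["date"])
offset = pd.DateOffset(days=1)
# Use full 'HH:MM:SS' bounds: "23:45:00" <= "23:45" is False for strings
df["date"] = np.where((df["time"] >= "22:00:00") & (df["time"] <= "23:45:00"),
                      df["date"] + offset,
                      df["date"])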
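
String comparison on times can be fragile (for example with unpadded values like "0:00:00"), so one possible variant is to convert the time strings to timedeltas before comparing. This is only a sketch. The sample rows are made up, and it assumes 'time' is a string column in H:MM:SS form.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with string times, including an unpadded "0:00:00".
df = pd.DataFrame({
    "date": ["2015-12-02", "2015-12-02", "2015-12-02", "2015-12-03"],
    "time": ["21:45:00", "22:00:00", "23:45:00", "0:00:00"],
    "a": [23.99, 24.03, 23.86, 23.66],
})
df["date"] = pd.to_datetime(df["date"])

# Elapsed time since midnight, so the comparison is numeric, not textual.
elapsed = pd.to_timedelta(df["time"])
late = elapsed >= pd.Timedelta(hours=22)

# Same np.where pattern as above, driven by the timedelta-based mask.
df["date"] = np.where(late, df["date"] + pd.Timedelta(days=1), df["date"])
print(df)
```

With a timedelta mask there is no need for an upper bound, because every time of day is below 24 hours.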