A specific time-series DataFrame in pandas

Time: 2020-05-06 22:59:40

Tags: python pandas

I have one year of 5-minute data like this:

import pandas as pd

df = pd.DataFrame([['1/1/2019 00:05', 1], ['1/1/2019 00:10', 5],['1/1/2019 00:15', 1], ['1/1/2019 00:20',3], ['1/1/2019 00:25', 1],
                   ['1/1/2019 00:30', 2], ['1/1/2019 00:35', 6],['1/1/2019 00:40', 8],['1/1/2019 00:45', 1], ['1/1/2019 00:55', 2],
                   ['1/1/2019 01:00', 8],['1/1/2019 01:05', 1], ['1/1/2019 01:10', 5],['1/1/2019 01:15', 1], ['1/1/2019 01:20',3],['1/1/2019 01:25', 1],
                   ['1/1/2019 01:30', 2], ['1/1/2019 01:35', 6],['1/1/2019 01:40', 8],['1/1/2019 01:45', 1], ['1/1/2019 01:55', 2],
                   ['1/1/2019 02:00', 8]],
                  columns = ['Date','Value'])

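I want to reshape it so that each row corresponds to one hour of a specific day and month, with that hour's 5-minute values spread across columns. Like this: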

df = pd.DataFrame([['day1hour0month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3],  ['day1hour1month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], 
                   ['day1hour2month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour3month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], 
                   ['day1hour4month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour5month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], 
                   ['day1hour6month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour7month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], 
                   ['day1hour8month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], ['day1hour9month1', 1, 1, 3, 4, 1, 0, 1, 5, 2, 1, 3,3], 
                   ['day31hour23month12', 1, 1, 8, 0, 6, 5, 3, 1, 1, 2,3,5]],
                  columns = ['Date', 'min05', 'min10', 'min15', 'min20', 'min25', 
                             'min30', 'min35', 'min40', 'min45', 'min50',
                             'min55', 'min60'])

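Is there a way to do this with pandas time-series functionality (without a for loop)? I would appreciate any suggestions on how to do it.

Thanks in advance!

Cheers.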

1 Answer:

Answer 0: (score: 1)

Based on the sample DataFrame:

In [2213]: df['Date'] = pd.to_datetime(df['Date'])
In [2191]: df['dmh'] = 'day' + df.Date.dt.day.astype(str) + 'hour' + df.Date.dt.hour.astype(str) + 'month' + df.Date.dt.month.astype(str)

In [2199]: df['minute'] = 'min' + df.Date.dt.minute.astype(str)

In [2211]: df.pivot(index='dmh', columns='minute', values='Value')                                                                                                                                          
Out[2211]: 
minute           min0  min10  min15  min20  min25  min30  min35  min40  min45  min5  min55
dmh                                                                                       
day1hour0month1   NaN    5.0    1.0    3.0    1.0    2.0    6.0    8.0    1.0   1.0    2.0
day1hour1month1   8.0    5.0    1.0    3.0    1.0    2.0    6.0    8.0    1.0   1.0    2.0
day1hour2month1   8.0    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN   NaN    NaN
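
Note that in the output above the minute labels are not zero-padded (so min5 sorts after min45), and a reading stamped on the hour starts a new row as min0 rather than closing the previous row as min60, which is what the question's target layout shows. A minimal sketch of one way to get closer to that layout follows. It assumes that a reading stamped HH:00 belongs to the preceding hour, which the question implies but does not state, and it uses a shortened sample; the names `shifted` and `wide` are only illustrative.

```python
import pandas as pd

# Shortened sample in the same shape as the question's data.
df = pd.DataFrame(
    [['1/1/2019 00:05', 1], ['1/1/2019 00:10', 5], ['1/1/2019 00:55', 2],
     ['1/1/2019 01:00', 8], ['1/1/2019 01:05', 1], ['1/1/2019 02:00', 8]],
    columns=['Date', 'Value'])
df['Date'] = pd.to_datetime(df['Date'])

# Assumption: a reading stamped HH:00 closes the previous hour ("min60"),
# so shift every timestamp back by 5 minutes before extracting the parts.
shifted = df['Date'] - pd.Timedelta(minutes=5)

df['dmh'] = ('day' + shifted.dt.day.astype(str)
             + 'hour' + shifted.dt.hour.astype(str)
             + 'month' + shifted.dt.month.astype(str))

# Undo the shift on the minute and zero-pad it: min05, min10, ..., min60.
df['minute'] = 'min' + (shifted.dt.minute + 5).astype(str).str.zfill(2)

wide = df.pivot(index='dmh', columns='minute', values='Value')
print(wide)
```

Slots with no reading still come out as NaN, as in the answer. Two caveats: the row labels are strings, so they sort lexicographically (day10 before day2), and the label does not contain the year, so `pivot` will raise on duplicate entries if the data spans more than one year.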