Pandas (Python) - Splitting data into time frames

Time: 2017-01-20 16:26:48

Tags: python pandas

I have a DataFrame with a DateTime index and a sample of electricity usage:

DateTime          Usage
01-Jan-17 12am    10
01-Jan-17 3am     5
01-Jan-17 6am     15
01-Jan-17 9am     40
01-Jan-17 12pm    60
01-Jan-17 3pm     62
01-Jan-17 6pm     45
01-Jan-17 9pm     18
02-Jan-17 12am    11
02-Jan-17 3am     4
02-Jan-17 6am     17
02-Jan-17 9am     37
02-Jan-17 12pm    64
02-Jan-17 3pm     68
02-Jan-17 6pm     41
02-Jan-17 9pm     16

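In reality, the series is much longer. I'd like to compare the same periods of the day across days so I can look at the daily seasonality of the time series. Is there a way in pandas to split the data so that these time series can be compared? I imagine the resulting DataFrame would look something like: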

Time    1-Jan   2-Jan
12am    10      11
3am     5       4
6am     15      17
9am     40      37
12pm    60      64
3pm     62      68
6pm     45      41
9pm     18      16

Thanks!

1 Answer:

Answer 0 (score: 1)

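Assuming DateTime is a column of str data type (if it is the index, call df.reset_index() first), you can split it into Date and Time and then pivot: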

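# split "01-Jan-17 12am" into a Date column and a Time column
df[['Date', 'Time']] = df.DateTime.str.split(" ", expand=True)
# pivot takes keyword arguments only in pandas 2.0 and later
df1 = df.pivot(index="Time", columns="Date", values="Usage").reset_index()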
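
To try the split-and-pivot step end to end, here is a minimal sketch that builds a small frame from a few rows of the question's data. It assumes DateTime is a string index, as the question describes, so it is moved into a column first; the shortened row selection is only for illustration.

```python
import pandas as pd

# A few rows copied from the question; DateTime is the (string) index.
df = pd.DataFrame(
    {"Usage": [10, 5, 60, 62, 11, 4, 64, 68]},
    index=pd.Index(
        ["01-Jan-17 12am", "01-Jan-17 3am", "01-Jan-17 12pm", "01-Jan-17 3pm",
         "02-Jan-17 12am", "02-Jan-17 3am", "02-Jan-17 12pm", "02-Jan-17 3pm"],
        name="DateTime",
    ),
)

# Move the index into a regular column so the .str accessor can be used on it.
df = df.reset_index()

# Split each label at the space into a Date part and a Time part.
df[["Date", "Time"]] = df["DateTime"].str.split(" ", expand=True)

# One row per time of day, one column per date.
df1 = df.pivot(index="Time", columns="Date", values="Usage").reset_index()
print(df1)
```

The Time labels are plain strings at this point, so the rows are not in chronological order.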


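How do you sort the Time column? It is actually not that straightforward. To do it, we need to extract a few columns from Time: the hour, the AM/PM indicator, and a flag for whether the hour is 12, because within each AM/PM half 12 has to come before all the other hours: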

# use a regex to extract Hour (the numeric part of Time) and the am/pm indicator
hourInd = df1.Time.str.extract(r"(?P<Hour>\d+)(?P<Ind>[pa]m)", expand=True)

# convert the Hour column to integer and add a column that flags the hour 12
# (IsTwelve is False when the hour is 12, so those rows sort first), then sort by
# am/pm indicator, IsTwelve and Hour, and use the resulting index to reorder the
# original data frame
df1.loc[(hourInd.assign(Hour = hourInd.Hour.astype(int), IsTwelve = hourInd.Hour != "12")
         .sort_values(["Ind", "IsTwelve", "Hour"]).index)]
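
As an alternative to the three-column sort, the labels can be converted to a 24-hour number and used as a sort key. This is only a sketch, not the approach from the answer: the helper to_hour is a hypothetical name, it assumes every label has the form "<hour>am" or "<hour>pm", and the key argument of sort_values requires pandas 1.1 or later.

```python
import pandas as pd

# Pivoted frame in the same shape as df1, with values from the question.
df1 = pd.DataFrame({
    "Time": ["12am", "12pm", "3am", "3pm", "6am", "6pm", "9am", "9pm"],
    "01-Jan-17": [10, 60, 5, 62, 15, 45, 40, 18],
    "02-Jan-17": [11, 64, 4, 68, 17, 41, 37, 16],
})


def to_hour(label: str) -> int:
    """Convert a label such as '12am' or '3pm' to an hour in the range 0-23."""
    hour = int(label[:-2]) % 12          # 12am -> 0, 12pm -> 0 before the offset
    return hour + (12 if label.endswith("pm") else 0)


# Sort by the computed hour instead of the raw string, then renumber the rows.
df_sorted = (
    df1.sort_values("Time", key=lambda s: s.map(to_hour))
       .reset_index(drop=True)
)
print(df_sorted)
```

Taking the hour modulo 12 handles the special case that the IsTwelve column covers in the answer: 12am maps to 0 and 12pm maps to 12.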
