How do I correctly get the sum of a subset of a Pandas time series?

Asked: 2016-03-30 07:06:41

Tags: python pandas

I have this time series (raw_series):

2016-03-30 00:01:00    2
2016-03-30 04:54:00    4
2016-03-30 08:51:00    1
2016-03-30 08:54:00    0
2016-03-30 08:55:00    1
2016-03-30 08:56:00    1
2016-03-30 08:57:00    2
2016-03-30 08:58:00    0
2016-03-30 09:00:00    1
2016-03-30 09:01:00    0
2016-03-30 09:04:00    0
2016-03-30 09:05:00    7
2016-03-30 09:06:00    4
2016-03-30 09:22:00    0
2016-03-30 09:24:00    8
2016-03-30 09:25:00    3
2016-03-30 09:28:00    0
2016-03-30 09:29:00    0
2016-03-30 09:39:00    1
2016-03-30 09:40:00    1
2016-03-30 09:41:00    1

I want to compute the sum of the values in the 09:00 hour and in the 08:00 hour. This is what I do (but it doesn't work):

from datetime import datetime, timedelta

now = datetime.now()
try:
    this_hour = raw_series[datetime(now.year, now.month, now.day, now.hour)].sum()
except KeyError:
    this_hour = 0

prev = now - timedelta(hours=1)
try:
    prev_hour = raw_series[datetime(prev.year, prev.month, prev.day, prev.hour)].sum()
except KeyError:
    prev_hour = 0

The value of now when I ran the program was (copied from the debug output file):

[30/Mar/2016 09:59:45] DEBUG [main.views:267] now is 2016-03-30 09:59:41.318779

The result is: this_hour = 1.0, prev_hour = 0 (the exception was raised)

What am I doing wrong?

1 answer:

Answer 0: (score: 1)

IIUC, you can use pd.to_datetime to convert the index to a DatetimeIndex, and then use a boolean mask on the hour:

import pandas as pd

s = pd.Series([2, 4, 1, 0, 1, 1, 2, 0, 1, 0, 0, 7, 4, 0, 8, 3, 0, 0, 1, 1, 1], index=['2016-03-30 00:01:00', '2016-03-30 04:54:00', '2016-03-30 08:51:00', '2016-03-30 08:54:00', '2016-03-30 08:55:00', '2016-03-30 08:56:00', '2016-03-30 08:57:00', '2016-03-30 08:58:00', '2016-03-30 09:00:00', '2016-03-30 09:01:00', '2016-03-30 09:04:00', '2016-03-30 09:05:00', '2016-03-30 09:06:00', '2016-03-30 09:22:00', '2016-03-30 09:24:00', '2016-03-30 09:25:00', '2016-03-30 09:28:00', '2016-03-30 09:29:00', '2016-03-30 09:39:00', '2016-03-30 09:40:00', '2016-03-30 09:41:00'])

# Convert the string labels to a DatetimeIndex so that .hour is available
s.index = pd.to_datetime(s.index)
cur_hour = 9
prev_hour = cur_hour - 1
# Keep only the rows whose timestamp falls in the given hour, then sum them
res1 = s[s.index.hour == cur_hour].sum()
res2 = s[s.index.hour == prev_hour].sum()

In [57]: res1
Out[57]: 26

In [58]: res2
Out[58]: 5
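
A note on why the original attempt gives 1.0 and 0: the reported output is consistent with raw_series[datetime(...)] doing an exact label lookup. The 09:00:00 row exists and holds 1, while there is no 08:00:00 row, so the second lookup raises KeyError. Also note that the hour mask above matches that hour on every date present in the index. If only one specific hour of one specific day is wanted, a half-open time window is one option. The following is only a minimal sketch, assuming the series already has a DatetimeIndex; hour_sum is a hypothetical helper name, and the series is a shortened copy of the data from the question.

```python
from datetime import datetime, timedelta

import pandas as pd


def hour_sum(series, hour_start):
    """Sum the values whose timestamps lie in [hour_start, hour_start + 1h)."""
    start = pd.Timestamp(hour_start)
    end = start + pd.Timedelta(hours=1)
    # Boolean mask over a half-open window, so 09:00:00 belongs to the 09 hour only
    in_window = (series.index >= start) & (series.index < end)
    return series[in_window].sum()


# Shortened copy of the question's series (assumption: DatetimeIndex)
raw_series = pd.Series(
    [1, 2, 0, 1, 0, 7],
    index=pd.to_datetime([
        "2016-03-30 08:56:00",
        "2016-03-30 08:57:00",
        "2016-03-30 08:58:00",
        "2016-03-30 09:00:00",
        "2016-03-30 09:01:00",
        "2016-03-30 09:05:00",
    ]),
)

# Fixed timestamp for reproducibility; datetime.now() would be used in practice
now = datetime(2016, 3, 30, 9, 59, 41)
this_start = now.replace(minute=0, second=0, microsecond=0)
prev_start = this_start - timedelta(hours=1)

this_hour = hour_sum(raw_series, this_start)
prev_hour = hour_sum(raw_series, prev_start)
```

An hour with no rows simply yields an empty selection, whose sum is 0, so the try/except around KeyError is not needed with this approach.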