Aggregating time series data in a Python dict

Asked: 2016-04-21 02:31:56

Tags: python

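I have two dictionaries in Python. time_data holds lists of datetime objects, and size_data holds the sizes recorded at those datetimes. Both dictionaries use the same matched strings as keys. time_data contains multiple values per minute. I want to sum the values that fall within each minute and show them as a single size value. How can I do this?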

time_data = {}
size_data = {}

# Code to get the new time `t` and size `s`

time_data[match].append(t)
size_data[match].append(s)

Sample data values look like this.

04:20:54 491
04:21:02 33
04:21:04 1063
04:21:04 1063
04:21:04 711
04:21:09 56
04:21:12 73
04:21:14 1066
04:21:14 931
04:21:18 618
04:21:18 51
04:21:22 27
04:21:24 1063
04:21:24 1063
04:21:24 535
04:21:33 24
04:21:33 1063
04:21:33 1063
04:21:33 978
04:21:43 36
04:21:45 1063
04:21:45 1063
04:21:45 755
04:21:53 27
04:21:55 1066
04:21:55 1063
04:21:55 711
04:22:03 30
04:22:05 1069
04:22:05 1063
04:22:05 1063
04:22:05 450
04:22:10 56
04:22:12 76
04:22:15 1066
04:22:15 1063
04:22:15 1066

The values are stored as follows:

time_data {  "string1":[04:22:10,04:22:11,04:22:11,04:22:11,04:22:12],  "string2":[04:22:10,04:22:11,04:22:13,04:22:13,04:22:13] }

size_data {  "string1":[491,33,55,1034,654],  "string2":[41,763,1055,104,454] }

I have just posted the values above as lists.

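1 answer:

Answer 0 (score: 0)

Updated answer:

Since your question is unclear, the code below uses HH:MM:SS as the key. If you want to sum all values per HH:MM instead, just replace t with t[:5] in the setdefault call inside the first for loop.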

$ cat test.py

time_data = { "string1":["04:22:10","04:22:11","04:22:11","04:22:11","04:22:12"], "string2":["04:22:10","04:22:11","04:22:13","04:22:13","04:22:13"] }

size_data ={ "string1":[491,33,55,1034,654], "string2":[41,763,1055,104,454] }

mydict = dict()

for key in time_data:
    timestamps = time_data.get(key)
    values = size_data.get(key)

    for t, v in zip(timestamps, values):
        mydict.setdefault(t, list()).append(v)


for t, vlist in mydict.items():
    print("key is {0}, value is {1}".format(t, sum(vlist)))

$ python test.py

key is 04:22:10, value is 532
key is 04:22:11, value is 1885
key is 04:22:13, value is 1613
key is 04:22:12, value is 654
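
The HH:MM variant mentioned above could be sketched roughly as follows. This is only an illustration, assuming the timestamps are zero-padded HH:MM:SS strings as in test.py. The helper name sum_per_minute is hypothetical, and keeping separate totals for each key (instead of merging all keys into one dict, as test.py does) is an assumption about what the question asks for.

```python
from collections import defaultdict


def sum_per_minute(time_data, size_data):
    """Sum sizes per HH:MM bucket, separately for each key (hypothetical helper)."""
    totals = {}
    for key, timestamps in time_data.items():
        per_minute = defaultdict(int)
        for t, v in zip(timestamps, size_data[key]):
            # t[:5] keeps only "HH:MM" (assumes zero-padded HH:MM:SS strings)
            per_minute[t[:5]] += v
        totals[key] = dict(per_minute)
    return totals


time_data = {
    "string1": ["04:22:10", "04:22:11", "04:22:11", "04:22:11", "04:22:12"],
    "string2": ["04:22:10", "04:22:11", "04:22:13", "04:22:13", "04:22:13"],
}
size_data = {
    "string1": [491, 33, 55, 1034, 654],
    "string2": [41, 763, 1055, 104, 454],
}

result = sum_per_minute(time_data, size_data)
for key in sorted(result):
    for minute in sorted(result[key]):
        print("{0} {1} {2}".format(key, minute, result[key][minute]))
```

Adding into a defaultdict(int) avoids building an intermediate list per bucket, and sorting the keys before printing gives a stable output order.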

Original answer:

You can use hour:minute as the key, collect all values that fall in the same hour:minute into a list, and sum them afterwards.

$ cat sample.csv

04:20:23 109
04:21:18 51
04:21:22 27

$ cat test.py

mydict = dict()
with open("sample.csv") as inputs:
    for line in inputs:
        date_time, value = line.strip().split()
        hour_min = date_time[:5]
        mydict.setdefault(hour_min, list()).append(int(value))


for hour_min, vlist in mydict.items():
    print("key:'{0}', value:'{1}'".format(hour_min, sum(vlist)))

$ python test.py

key:'04:21', value:'78'
key:'04:20', value:'109'
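
The question says time_data holds datetime objects, not strings, so the same grouping could be sketched by truncating each datetime to the minute. This is a sketch under assumptions: the helper name sum_by_minute is hypothetical, and the date 2016-04-21 is made up for illustration because the sample data shows only times.

```python
from collections import defaultdict
from datetime import datetime


def sum_by_minute(timestamps, sizes):
    """Sum sizes whose timestamps fall in the same minute (hypothetical helper)."""
    totals = defaultdict(int)
    for ts, size in zip(timestamps, sizes):
        # Drop seconds and microseconds so all values in a minute share one key
        minute = ts.replace(second=0, microsecond=0)
        totals[minute] += size
    return dict(totals)


# First three rows of the question's sample data; the date is an assumption.
stamps = [
    datetime(2016, 4, 21, 4, 20, 54),
    datetime(2016, 4, 21, 4, 21, 2),
    datetime(2016, 4, 21, 4, 21, 4),
]
sizes = [491, 33, 1063]

minute_totals = sum_by_minute(stamps, sizes)
for minute in sorted(minute_totals):
    print(minute.strftime("%H:%M"), minute_totals[minute])
```

To apply this to the dictionaries from the question, one could call the helper once per key, for example sum_by_minute(time_data[key], size_data[key]).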