Question

I read the data below from a .txt file and stored it in a Python list object. Here is some sample output of the rows in that list.

Year Month OtherValue
(1977, 10) 52
(1843, 9) 0
(1946, 6) 83
(1891, 3) 11
(2001, 5) 69
(1868, 7) 27
(1916, 9) 20
(1871, 10) 60
(1845, 3) 46
(1919, 12) 26
(1832, 8) 0
(1880, 2) 23
(1933, 8) 0
(2007, 1) 20
(1930, 11) 51
(1920, 3) 20

...

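I need to group by year, then by month, and then compute the monthly average in a new column. The year, month, and monthly average should be written to a new .txt file in the following format: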

Year Month Averages
2011 01    34.875
2011 02    29.897
2011 03    13.909
....

Please advise.

Answer 1

Use defaultdict:

Example:

>>> from collections import defaultdict
>>> lis = [(2011, 1, 50), (2012, 1, 5), (2011, 1, 35), (2012, 1, 15), (2013, 5, 37), (2011, 3, 45)]
>>> dic = defaultdict(lambda: defaultdict(list))
>>> for year, month, val in lis:
...     dic[year][month].append(val)
...
>>> dic
defaultdict(<function <lambda> at 0x896afb4>, {2011: defaultdict(<type 'list'>, {1: [50, 35], 3: [45]}), 2012: defaultdict(<type 'list'>, {1: [5, 15]}), 2013: defaultdict(<type 'list'>, {5: [37]})})

Average for the first month of 2011:

>>> sum(dic[2011][1])/float(len(dic[2011][1]))
42.5
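
To cover every year and month rather than a single one, the same nested defaultdict can be walked in sorted order. The following is only a sketch: the helper names `monthly_averages` and `format_rows` are made up for illustration, and the input is assumed to be flat (year, month, value) tuples as in the example above.

```python
from collections import defaultdict


def monthly_averages(rows):
    """Return sorted (year, month, average) tuples from (year, month, value) rows."""
    grouped = defaultdict(lambda: defaultdict(list))
    for year, month, val in rows:
        grouped[year][month].append(val)

    result = []
    for year in sorted(grouped):
        for month in sorted(grouped[year]):
            vals = grouped[year][month]
            # float() keeps the division exact on Python 2 as well
            result.append((year, month, sum(vals) / float(len(vals))))
    return result


def format_rows(averages):
    """Build text lines in the 'Year Month Averages' layout from the question."""
    lines = ["Year Month Averages"]
    for year, month, avg in averages:
        # month zero-padded to two digits, average with three decimals
        lines.append("%d %02d    %.3f" % (year, month, avg))
    return lines


lis = [(2011, 1, 50), (2012, 1, 5), (2011, 1, 35),
       (2012, 1, 15), (2013, 5, 37), (2011, 3, 45)]
averages = monthly_averages(lis)
lines = format_rows(averages)
```

Writing the result out is then a matter of opening a file of your choosing in "w" mode and writing "\n".join(lines) to it.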

Answer 2

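You can use list.sort() to sort the values in your list by date, then use itertools.groupby to group them. The fact that groupby returns iterators rather than lists (so they have no len) makes this a little tricky, but a list comprehension gets you the values you need easily enough: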

from itertools import groupby
from operator import itemgetter

data = [(1977, 10, 52), (1977, 11, 20), (1977, 10, 0)] # example data

key = itemgetter(0, 1) # a callable to get year and month from data values
data.sort(key=key)
groups = [(date, [value for y, m, value in group]) for date, group in groupby(data, key)]
averages = [date + (sum(values) / len(values),) for date, values in groups]

# for example data, averages will be [(1977, 10, 26.0), (1977, 11, 20.0)]
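
The rows printed in the question look like ((year, month), value) pairs rather than flat triples. Assuming that is the real shape of the list (a guess based on the printed output only), the same sort-then-group approach might be adapted like this:

```python
from itertools import groupby
from operator import itemgetter

# Assumed shape: ((year, month), value), inferred from the question's output
data = [((1977, 10), 52), ((1977, 11), 20), ((1977, 10), 0), ((1843, 9), 0)]

by_date = itemgetter(0)  # the (year, month) tuple is the grouping key

averages = []
# groupby only merges adjacent items, so the data must be sorted by the same key first
for (year, month), group in groupby(sorted(data, key=by_date), key=by_date):
    values = [value for _, value in group]  # materialize the iterator so len() works
    averages.append((year, month, sum(values) / float(len(values))))
```

Using sorted() instead of list.sort() leaves the original list untouched, which may or may not matter for your use case.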

How can I group items in Python so that I can get the average of the grouped items?
