Question

I have a data file, for example:

1  123  something else
2  234  something else
3  500  something else
.
. 
.
1  891  something else
2  234  something else
3  567  something else 
.
.
.

I am trying to end up with the following file:

1 1014
2  468
3 1067

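That is, if the number in column 1 is the same, add up the numbers in column 2 (or another column). I believe the way forward is to read the columns into a nested list and go from there, but I have been struggling with that. Another approach I tried was to create a new file containing only the entries I am interested in: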

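# "data.txt" and "output.txt" are placeholder file names.
with open("data.txt") as f, open("output.txt", "w") as output:
    for line in f:
        # Fixed-width slices: column 1 is character 0, column 2 is characters 3-5.
        output.write(line[0:1] + "," + line[3:6] + "\n")

with open("output.txt", "r") as file:
    data_list = [[int(x) for x in line.split(",")] for line in file]

print(data_list)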

This returns

[[1, 123], [2, 234], [3, 500], [1, 891], [2, 234], [3, 567]]

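I suppose I could loop over that list, compare data_list[x][0], and add the values when they match, but that does not seem like an elegant solution. Can anyone suggest a more elegant way? In particular, I have been struggling with summing specific items in a nested list.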

Answer 1

Use a dictionary to track the sums; collections.defaultdict() makes it a little easier to start a key at 0 if it has not been seen before:

from collections import defaultdict

sums = defaultdict(int)

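with open(filename) as f:  # filename: path to your data file
    for line in f:
        # Split on whitespace at most twice: key, value, rest of the line.
        col1, col2, rest = line.split(None, 2)
        sums[col1] += int(col2)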

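This reads your initial file, splits each line on whitespace at most twice to get the first two columns, and then sums the second column keyed on the first. The three-way unpacking assumes every line has at least three whitespace-separated fields; a shorter or blank line raises ValueError. A quick demonstration: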

>>> from collections import defaultdict
>>> sample = '''\
... 1  123  something else
... 2  234  something else
... 3  500  something else
... 1  891  something else
... 2  234  something else
... 3  567  something else 
... '''.splitlines()
>>> sums = defaultdict(int)
>>> for line in sample:
...     col1, col2, rest = line.split(None, 2)
...     sums[col1] += int(col2)
... 
>>> sums
defaultdict(<type 'int'>, {'1': 1014, '3': 1067, '2': 468})
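If you already have the nested list from the question rather than the raw file, the same dictionary idea applies to it directly. The following is only a rough sketch under some assumptions: sum_by_key and its key_index and value_index parameters are hypothetical names made up for illustration, and the "%s %4d" format is just a guess at the column layout of the desired output.

```python
from collections import defaultdict


def sum_by_key(rows, key_index=0, value_index=1):
    """Sum row[value_index] for each distinct row[key_index] (hypothetical helper)."""
    sums = defaultdict(int)  # unseen keys start at 0
    for row in rows:
        sums[row[key_index]] += row[value_index]
    return dict(sums)


# The nested list printed in the question.
data_list = [[1, 123], [2, 234], [3, 500], [1, 891], [2, 234], [3, 567]]
totals = sum_by_key(data_list)

# One "key total" line per key, in ascending key order.
lines = ["%s %4d" % (key, totals[key]) for key in sorted(totals)]
print("\n".join(lines))
```

Here the keys are ints because data_list already holds ints, so sorted() orders them numerically; with the string keys from the file-based version above you would probably want sorted(totals, key=int) instead. To produce a file, write the joined lines to an open file object instead of printing them.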

Sum of specific items in a nested list
