Question

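I think I'm close to what I want, but I'm still a beginner, so I don't know whether this is the best way. Suppose we have a file with hundreds of lines, each ending in a value that I want to sum. Writing the whole program in a single line of code seems complicated, so I'd rather build it step by step. Suppose the file contains the following lines: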

Type of line 1: 10
Type of line 1: 5
Type of line 1: 15
Type of line 2: 50
Type of line 2: 25
Type of line 2: 5
Type of line 3: 1
Type of line 3: 14
Type of line 3: 2

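Since there are several types of line, what I want is the sum of the values that appear on lines of the same type. For example, the output should be: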

Type of line 1: 30
Type of line 2: 80
Type of line 3: 17

The type of a line is just a string.

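To do this, I read the file line by line and split each line on the ':' character. I then save the split line in a variable so that I can refer to its elements later and add up the values of lines of the same type. I know that, because the lines of a file are strings, the values have to be converted to integers before doing arithmetic on them, so it should be something like int(y[1]), but I'm not sure. Any suggestions on whether I'm on the right track? This is what I've tried so far: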

with open('file.txt','r') as f:
    for line in f:
        y = line.split(':')
        ...

Answer 1

You can use itertools.groupby to group the lines by their line type, then sum the trailing number of each line:

import itertools
import re

def line_type(x):  # group key: the number that follows "line ", compared as an integer
    return int(re.findall(r'(?<=line\s)\d+', x)[0])

with open('filename.txt') as f:  # read the lines, dropping newlines and blank lines
    file_data = [i.strip() for i in f if i.strip()]
grouped = itertools.groupby(sorted(file_data, key=line_type), key=line_type)  # groupby needs sorted input
final_results = ['Type of line {}: {}'.format(a, sum(int(re.findall(r'\d+$', i)[0]) for i in b)) for a, b in grouped]

Output:

['Type of line 1: 30', 'Type of line 2: 80', 'Type of line 3: 17']
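
Because the question says the line type is just a string, the same groupby idea can also be sketched without regular expressions, by grouping on the text before the last colon. This is only an illustration: the helper names `split_line` and `sum_by_label` are hypothetical, and it assumes every non-blank line has the form `label: integer`.

```python
import itertools

def split_line(line):
    """Split 'label: value' at the last colon and return (label, int value)."""
    label, value = line.rsplit(':', 1)
    return label.strip(), int(value)

def sum_by_label(lines):
    """Return 'label: total' strings, one per distinct label."""
    pairs = [split_line(line) for line in lines if line.strip()]
    # groupby only merges adjacent items, so sort by label first
    pairs.sort(key=lambda p: p[0])
    return ['{}: {}'.format(label, sum(v for _, v in group))
            for label, group in itertools.groupby(pairs, key=lambda p: p[0])]

sample = [
    'Type of line 1: 10', 'Type of line 1: 5', 'Type of line 1: 15',
    'Type of line 2: 50', 'Type of line 2: 25', 'Type of line 2: 5',
    'Type of line 3: 1', 'Type of line 3: 14', 'Type of line 3: 2',
]
print(sum_by_label(sample))
# ['Type of line 1: 30', 'Type of line 2: 80', 'Type of line 3: 17']
```

Note that labels are sorted as strings here, so a label ending in 10 would come before one ending in 2.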

Answer 2

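This is a basic answer that uses only standard data types. It may not be the most efficient approach, but it will help you learn the basics of Python.

A dict is a good choice for the intermediate data structure, because it cannot hold more than one key with the same name. We use it to accumulate the totals for your lines.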

    output = dict()

    with open("file_name", "r") as file:
        for line in file:  # iterate over the file line by line
            line_name, value = line.split(":")
            value = value.strip()  # strip the newline character
            if line_name in output:  # have we seen this line type before?
                output[line_name] += int(value)  # augmented addition operator
            else:
                output[line_name] = int(value)  # first occurrence: assign the initial value

    for key, value in output.items():  # format the output
        print("The sum of %s is %s" % (key, value))

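Output (from an older Python, where dict order is arbitrary; Python 3.7+ prints the keys in insertion order):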

The sum of Type of line 2 is 80
The sum of Type of line 1 is 30
The sum of Type of line 3 is 17
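
The dict approach above can be shortened a little with `collections.defaultdict`, which removes the need for the if/else branch. The following is a minimal sketch rather than part of the original answer: the function name `sum_file_lines` is hypothetical, and it assumes each non-blank line has the form `label: integer`.

```python
from collections import defaultdict

def sum_file_lines(lines):
    """Sum the trailing integer of each 'label: value' line, per label."""
    totals = defaultdict(int)  # missing labels start at 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        label, value = line.rsplit(':', 1)  # split at the last colon only
        totals[label.strip()] += int(value)  # int() ignores surrounding whitespace
    return dict(totals)

sample = ['Type of line 1: 10\n', 'Type of line 2: 50\n', 'Type of line 1: 5\n']
for label, total in sum_file_lines(sample).items():
    print('{}: {}'.format(label, total))
```

Because a file object yields its lines when iterated, an open file can be passed in directly, for example `sum_file_lines(f)` inside a `with open(...) as f:` block.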
