Summing certain values from a file in Python

Time: 2017-12-18 13:17:47

Tags: python python-2.7 file

I have a text file like this (this is a sample; the actual file is very large):

[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')
[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')
[52639 - 2017-12-08 11:58:46,984] INFO __main__.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')
[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')
[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')

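I want to sum the values that appear just before the last comma on each line. The result would be 665 + 223 + 1052 + 541 + 1303 = 3784.

I can't figure out how to do this. Any help would be greatly appreciated.

2 answers:

Answer 0 (score: 0)

Here, you can try this: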

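summation = 0

with open("test.txt", "r") as infile:
    for line in infile:
        # Splitting on ", " leaves the timestamp's "58,680" intact,
        # so the wanted number is always the fourth field (index 3).
        fields = line.split(", ")
        summation = summation + int(fields[3])

print(summation)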

Output:

3784

The contents of test.txt are structured as follows:

[52639 - 2017-12-08 11:56:58,680] INFO main.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')
[52639 - 2017-12-08 11:57:37,686] INFO main.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')
[52639 - 2017-12-08 11:58:46,984] INFO main.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')
[52639 - 2017-12-08 12:01:10,073] INFO main.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')
[52639 - 2017-12-08 12:03:37,570] INFO main.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')

If you want to print all the numbers being summed, you can store each one in a list:

summation = 0
coefficients = []

with open("test.txt", "r") as infile:
    for line in infile:
        fields = line.split(", ")
        coefficients.append(fields[3])
        summation = summation + int(fields[3])

# A single string argument prints the same on Python 2 and 3;
# print(..., end="=") is Python 3 only.
print("+".join(coefficients) + "=" + str(summation))

Output:

665+223+1052+541+1303=3784
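
Since the question asks for the value before the last comma, a variation is to split from the right, which does not depend on how many comma-separated fields come earlier in the line. This is only a minimal sketch, assuming every relevant line ends like `..., 665, '67')`. The helper names `value_before_last_comma` and `sum_values` are made up for illustration, and lines that do not fit the shape are skipped rather than raising an error:

```python
def value_before_last_comma(line):
    """Return the integer between the last two commas, or None if absent."""
    parts = line.rsplit(",", 2)  # at most three pieces, split from the right
    if len(parts) < 3:
        return None
    try:
        return int(parts[1].strip())
    except ValueError:
        return None


def sum_values(lines):
    """Sum the extracted values over any iterable of lines (e.g. an open file)."""
    total = 0
    for line in lines:
        value = value_before_last_comma(line)
        if value is not None:
            total += value
    return total


# Two lines from the question, used as an in-memory stand-in for the file.
sample = [
    "[52639 - 2017-12-08 11:56:58,680] INFO __main__.master 251 "
    "Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')",
    "[52639 - 2017-12-08 11:57:37,686] INFO __main__.master 251 "
    "Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')",
]
print(sum_values(sample))  # 665 + 223 = 888
```

Passing the open file object instead of `sample` would process the real file one line at a time.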

Answer 1 (score: 0)

import re
s = """
[52639 - 2017-12-08 11:56:58,680] INFO main.master 251 Finished pre-smap protein tag ('4h02', [], 35000, 665, '67')

[52639 - 2017-12-08 11:57:37,686] INFO main.master 251 Finished pre-smap protein tag ('4nqk', [], 35000, 223, '18')

[52639 - 2017-12-08 11:58:46,984] INFO main.master 251 Finished pre-smap protein tag ('3j60', [], 3500, 1052, '65')

[52639 - 2017-12-08 12:01:10,073] INFO main.master 251 Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')

[52639 - 2017-12-08 12:03:37,570] INFO main.master 251 Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')
"""

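# A number followed by the final quoted number and the closing parenthesis.
# Raw string avoids escape warnings; + (not *) never captures an empty string.
pattern = r", ([0-9]+), '[0-9]+'\)"

print(sum(int(i) for i in re.findall(pattern, s)))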

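Have you tried using the regular expression library? By building a pattern that matches the number immediately before the final quoted number and closing parenthesis, you can capture all of those numbers, then use a generator expression to convert them to integers and sum them.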
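
Because the real file is described as very large, the same idea can be applied line by line instead of to one big string. The following is a sketch under that assumption, not a definitive implementation. `sum_tagged_values` is a hypothetical helper name, and it accepts any iterable of lines, such as an open file:

```python
import re

# A number followed by the final quoted number and the closing parenthesis.
PATTERN = re.compile(r", ([0-9]+), '[0-9]+'\)")


def sum_tagged_values(lines):
    """Sum the captured number from every line that matches PATTERN."""
    total = 0
    for line in lines:
        match = PATTERN.search(line)
        if match:
            total += int(match.group(1))
    return total


# Two lines from the question, used as an in-memory stand-in for the file.
sample = [
    "[52639 - 2017-12-08 12:01:10,073] INFO __main__.master 251 "
    "Finished pre-smap protein tag ('4ddg', [], 35000, 541, '38')",
    "[52639 - 2017-12-08 12:03:37,570] INFO __main__.master 251 "
    "Finished pre-smap protein tag ('4ksl', [], 35000, 1303, '68')",
]
print(sum_tagged_values(sample))  # 541 + 1303 = 1844
```

Lines that do not match the pattern contribute nothing to the total, so unrelated log entries are ignored.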