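I am writing a program that searches the file mbox.text for lines such as "X-DSPAM-Confidence: 0.8475". So far I can find the string and count how many times it occurs in the file. The problem is that for every occurrence I also need to add up the number at the end of the line (0.8475 here). This is where I am stuck: I cannot get the sum of the floating-point numbers that appear at the end of those lines.
My file looks like this: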
X-Content-Type-Message-Body: text/plain; charset=UTF-8
Content-Type: text/plain; charset=UTF-8
X-DSPAM-Result: Innocent
X-DSPAM-Processed: Sat Jan 5 09:14:16 2008
X-DSPAM-Confidence: 0.8475
X-DSPAM-Probability: 0.0000
My code:
text_file = raw_input("please enter the path of the file that you want to open:")
open_file = open(text_file)
print "Text file has been open "
count = 0
total = 0.00000
for line in open_file:
    if 'X-DSPAM-Confidence:' in line:
        total =+ float(line[20:])
        count = count + 1
print total/count
print "The number of line with X-DSPAM-Confidence: is:", count
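What should I do?
Answer 0 (score: 0)
Slicing a string returns a substring, which float() can convert, so the slice is not what breaks the sum. The bug is the operator: in-place addition is += and not =+. Python reads total =+ x as total = (+x), so total is overwritten on every matching line instead of accumulated. That said, you should use split rather than a hard-coded offset.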
count = 0
total = 0.00000
for line in open_file:
    if 'X-DSPAM-Confidence:' in line:
        total += float(line.split()[-1])  # change here.
        count = count + 1
print total/count
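To see why the original =+ produced a wrong average without raising an error, here is a minimal sketch comparing the two operators. The three values are made up for illustration and are not taken from the asker's file.

```python
# Compare `=+` (plain assignment of a positive number) with `+=` (accumulation).
values = [0.8475, 0.6178, 0.6961]  # hypothetical confidence values

wrong = 0.0
for v in values:
    wrong =+ v   # parsed as: wrong = (+v), so each pass overwrites the total

right = 0.0
for v in values:
    right += v   # in-place addition keeps a running total

print(wrong)  # only the last value survives
print(right)  # the sum of all three values
```

With =+, the "total" ends up as whatever the last matching line contained, so total/count is far too small.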
Even better, use sum and len.
with open('test.txt') as f:
    data = [float(line.split()[-1]) for line in f if line.strip().startswith('X-DSPAM-Confidence:')]
    print(sum(data)/len(data))
A solution for Python 3.4 or newer, using mean from the statistics module.
from statistics import mean
with open('test.txt') as f:
    data = [float(line.split()[-1]) for line in f if line.strip().startswith('X-DSPAM-Confidence:')]
    print(mean(data))
Answer 1 (score: 0)
The print statement, like a Magic 8-Ball, tells all:
>>> print repr(line[20:])
' 0.0000\n'
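The slice picks up more than the number itself: a leading space and the trailing newline. float() tolerates surrounding whitespace, so this is tidying rather than a fix, but you can narrow the slice. Note that the output above appears to come from the longer X-DSPAM-Probability: line. The X-DSPAM-Confidence: header is 19 characters, so after the single space the value starts at index 20; starting at 21 would drop the first digit:
total += float(line[20:-1])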
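Hard-coded offsets are easy to get wrong by one, so a less fragile option is to derive the offset from the header itself. The sketch below assumes each header line has the form shown in the question; confidence_value is a hypothetical helper name, not part of any library.

```python
def confidence_value(line, header="X-DSPAM-Confidence:"):
    """Return the float that follows `header`, or None if the line does not start with it."""
    if not line.startswith(header):
        return None
    # float() ignores leading and trailing whitespace, including the newline.
    return float(line[len(header):])

sample = "X-DSPAM-Confidence: 0.8475\n"  # line copied from the question's file excerpt
print(repr(sample[20:]))         # what the original slice actually extracts
print(confidence_value(sample))  # value parsed without a hard-coded index
```

Because the offset comes from len(header), the same helper works for other headers such as X-DSPAM-Probability: by passing a different header argument.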