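I am writing a program that takes the senders' email addresses from a list of emails (saved to a .txt file), adds them to a dictionary, and counts how many times each address was used to send an email in that list.
The program should take an address, store it in a variable (`word`), and then check whether it is already a key in the dictionary. If not, it adds the address to the dictionary with a value of 1. If it is, it increments the address's value by 1. After that, the program moves on to the next address and updates the `word` variable.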
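As a small illustration of the counting step described above, here is a sketch that writes the check-then-increment logic out explicitly and then with the `dict.get` idiom. The addresses and the names `counts` and `counts_get` are made up for the example.

```python
# Hypothetical addresses, only for illustration.
addresses = ['a@example.com', 'b@example.com', 'a@example.com']

# Explicit version: add the key with 1, or increment an existing key.
counts = {}
for word in addresses:
    if word not in counts:
        counts[word] = 1
    else:
        counts[word] += 1

# Equivalent version: get() returns 0 when the key is missing.
counts_get = {}
for word in addresses:
    counts_get[word] = counts_get.get(word, 0) + 1

print(counts)      # {'a@example.com': 2, 'b@example.com': 1}
print(counts_get)  # {'a@example.com': 2, 'b@example.com': 1}
```

Both loops touch each address exactly once, so any extra increments have to come from how the addresses are extracted, not from the counting itself.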
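The problem is that when the program checks whether an address is in the dictionary, it increments the address's value and then repeats that 8 more times. So each time an address is checked, its value effectively goes up by 9 before the program moves on to the next address.
I have played around with the code a bit, but there is not much of it, so there is not much I can change. In case it helps, I am working with nested for loops. The only thing that seems to change the result is how I identify the lines the program should pay attention to (see below).
This is where the problem is: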
    #Iterate through each line of the file
    for line in fhand:
        #Focus only on the sender lines
        if not line.startswith('From'): continue
        #Turn every line into a list of strings
        words = line.split()
        #Iterate through the words in the strings for the sender's address
        for word in words:
            #if words[0] != 'From':continue
            word = words[1]
            print(word)
            #Add address of the dictionary/ increment address's value
            domain[word] = domain.get(word, 0) +1
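One plausible way to account for the factor of 9 is sketched below. It assumes each message contributes two lines that start with `From`: a 7-word envelope line and a 2-word `From:` header line, as in the mbox format. The two sample lines and the address are invented for the example.

```python
# Two hypothetical lines modeled on one mbox message.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",  # 7 words
    "From: alice@example.com",                          # 2 words
]

domain = {}
for line in sample:
    if not line.startswith('From'):
        continue
    words = line.split()
    # The inner loop runs once per word on the line,
    # but every pass counts the same item, words[1].
    for word in words:
        word = words[1]
        domain[word] = domain.get(word, 0) + 1

print(domain)  # {'alice@example.com': 9}
```

Under that assumption, 7 increments come from the envelope line and 2 from the header line, which gives 9 per message.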
The code goes from 9 increments per address to 7 if I remove
    if not line.startswith('From'): continue
and use
    if words[0] != 'From':continue
instead.
My attempt at getting rid of the nested loop:
    for line in fhand:
        #Focus only on the sender lines
        #if not line.startswith('From'): continue
        #Turn every line into a list of strings
        words = line.split()
        #test
        if words[0] != 'From':continue
        word = words[1]
        #print(word)
        domain[word] = domain.get(word, 0) +1
Now the dictionary values are exactly double what they should be.
Actual output:
{'stephen.marquard@uct.ac.za': 4, 'louis@media.berkeley.edu': 6,
'zqian@umich.edu': 8, 'rjlowe@iupui.edu': 4, 'cwen@iupui.edu': 10,
'gsilver@umich.edu': 6, 'wagnermr@iupui.edu': 2, 'antranig@caret.cam.ac.uk':
2, 'gopal.ramasammycook@gmail.com': 2, 'david.horwitz@uct.ac.za': 8,
'ray@media.berkeley.edu': 2}
Expected output:
{'stephen.marquard@uct.ac.za': 2, 'louis@media.berkeley.edu': 3,
'zqian@umich.edu': 4, 'rjlowe@iupui.edu': 2, 'cwen@iupui.edu': 5,
'gsilver@umich.edu': 3, 'wagnermr@iupui.edu': 1, 'antranig@caret.cam.ac.uk':
1, 'gopal.ramasammycook@gmail.com': 1, 'david.horwitz@uct.ac.za': 4,
'ray@media.berkeley.edu': 1}
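Doubled counts like these are what a loose prefix test would produce if every message has both a `From ` envelope line and a `From:` header line. The sketch below compares the two prefixes on a few invented lines. The helper `count_with_prefix` and the sample data are hypothetical and not part of the original program.

```python
# Hypothetical lines modeled on two mbox messages.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "Subject: hello",
    "",
    "From bob@example.com Sat Jan  5 10:00:00 2008",
    "From: bob@example.com",
]

def count_with_prefix(lines, prefix):
    """Count the second word of every line that starts with prefix."""
    counts = {}
    for line in lines:
        if not line.startswith(prefix):
            continue
        address = line.split()[1]
        counts[address] = counts.get(address, 0) + 1
    return counts

loose = count_with_prefix(sample, 'From')    # matches 'From ' and 'From:'
strict = count_with_prefix(sample, 'From:')  # matches only the header line

print(loose)   # {'alice@example.com': 2, 'bob@example.com': 2}
print(strict)  # {'alice@example.com': 1, 'bob@example.com': 1}
```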
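Input: I tried to just copy and paste the input, but it was too large. It can be found here: mbox-short.txt
Edit: I changed
    if not line.startswith('From'): continue
to
    if not line.startswith('From:'): continue
Adding that colon, along with getting rid of the nested loop, seems to have fixed my code.
Problem solved.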
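For reference, the fix described above (the `From:` prefix with a colon and no inner loop) can be sketched as a small function. The name `count_senders`, the length guard, and the sample lines are assumptions added for illustration. The function accepts any iterable of lines, so an open file handle such as `fhand` should work as well.

```python
def count_senders(lines):
    """Return a dict mapping each sender address to its message count."""
    domain = {}
    for line in lines:
        # Only the header line 'From: address' is counted.
        if not line.startswith('From:'):
            continue
        words = line.split()
        # Guard (an addition): skip a 'From:' line with no address.
        if len(words) < 2:
            continue
        word = words[1]
        domain[word] = domain.get(word, 0) + 1
    return domain

# Hypothetical lines standing in for the real file.
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From: alice@example.com",
    "",
    "From bob@example.com Sat Jan  5 10:00:00 2008",
    "From: bob@example.com",
    "From alice@example.com Sat Jan  5 11:00:00 2008",
    "From: alice@example.com",
]

result = count_senders(sample)
print(result)  # {'alice@example.com': 2, 'bob@example.com': 1}
```

With the real file, this would be called as `count_senders(fhand)` after opening mbox-short.txt.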
Answer 0: (score: 0)
Try this:
    #Iterate through each line of the file
    for line in fhand:
        #Focus only on the sender header lines ('From:' with the colon)
        if not line.startswith('From:'): continue
        #Turn every line into a list of strings
        words = line.split()
        #The sender's address is the second item; no inner loop is needed
        word = words[1]
        #Add address to the dictionary / increment address's value
        domain[word] = domain.get(word, 0) +1