Question

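I am writing a program that takes the senders' email addresses from a list of emails (saved to a .txt file), adds them to a dictionary, and counts how many times each address was used to send an email in that list.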

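The program should take an address, store it in a variable (word), and then check whether it is already used as a dictionary key. If it is not, the address is added to the dictionary with a value of 1. If it is, the address's value is incremented by 1. After that, the program moves on to the next address and updates the "word" variable.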
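As a point of reference, the intended counting logic can be sketched on its own, separate from the file parsing. The `addresses` list and the `counts` name below are hypothetical stand-ins, not part of my program:

```python
# Hypothetical sample data standing in for addresses already pulled from the file.
addresses = ["a@example.com", "b@example.com", "a@example.com"]

counts = {}
for address in addresses:
    # dict.get returns 0 when the key is missing, so a first sighting stores 1;
    # otherwise the existing count is incremented by 1.
    counts[address] = counts.get(address, 0) + 1

print(counts)
```

Each address is counted exactly once per appearance here, which is the behavior the real program should have.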

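The problem is that when the program checks whether an address is in the dictionary, it increments the address's value and then repeats that 8 more times. So each time an address is checked, its value is effectively increased by 9 before the program moves on to the next address.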

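I have tinkered with the code a bit, but there is not much of it, so there is little more I can try. In case it helps, I am working with nested for loops. The only thing that seems to change the result is how I identify the lines the program should pay attention to (see below).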

The problem is here:

# Iterate through each line of the file
for line in fhand:
    # Focus only on the sender lines
    if not line.startswith('From'): continue
    # Turn every line into a list of strings
    words = line.split()

    # Iterate through the words in the strings for the sender's address
    for word in words:
        #if words[0] != 'From':continue
        word = words[1]
        print(word)

        # Add address to the dictionary / increment address's value
        domain[word] = domain.get(word, 0) + 1

Each address goes from being counted 9 times to being counted 7 times if I remove

if not line.startswith('From'): continue

and use

if words[0] != 'From':continue

instead.

I tried getting rid of the nested loop:

for line in fhand:
    # Focus only on the sender lines
    #if not line.startswith('From'): continue
    # Turn every line into a list of strings
    words = line.split()

    # test
    if words[0] != 'From':continue
    word = words[1]
    #print(word)
    domain[word] = domain.get(word, 0) + 1

Now the dictionary values are exactly double what they should be.

Actual output:

{'stephen.marquard@uct.ac.za': 4, 'louis@media.berkeley.edu': 6, 
'zqian@umich.edu': 8, 'rjlowe@iupui.edu': 4, 'cwen@iupui.edu': 10, 
'gsilver@umich.edu': 6, 'wagnermr@iupui.edu': 2, 'antranig@caret.cam.ac.uk': 
2, 'gopal.ramasammycook@gmail.com': 2, 'david.horwitz@uct.ac.za': 8, 
'ray@media.berkeley.edu': 2}

Expected output:

{'stephen.marquard@uct.ac.za': 2, 'louis@media.berkeley.edu': 3,
 'zqian@umich.edu': 4, 'rjlowe@iupui.edu': 2, 'cwen@iupui.edu': 5, 
'gsilver@umich.edu': 3, 'wagnermr@iupui.edu': 1, 'antranig@caret.cam.ac.uk':
 1, 'gopal.ramasammycook@gmail.com': 1, 'david.horwitz@uct.ac.za': 4, 
'ray@media.berkeley.edu': 1}

Input: I tried to simply copy and paste the input, but it is too large. It can be found here: mbox-short.txt

Edit: I changed

if not line.startswith('From'): continue

to

if not line.startswith('From:'): continue

Adding that colon, along with getting rid of the nested loop, seems to have fixed my code.

Problem solved.
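For anyone hitting the same thing, here is a minimal sketch of the corrected logic. It assumes the input follows the usual mbox layout, where each message has both an envelope line (`From <address> <date>`) and a header line (`From: <address>`); both start with 'From', which would explain the doubled counts. The function name `count_senders` and the sample lines are illustrative assumptions, not the real input:

```python
def count_senders(lines):
    """Count sender addresses using only the 'From:' header lines."""
    counts = {}
    for line in lines:
        # 'From:' (with the colon) skips the 'From <address> <date>' envelope
        # line, which would otherwise count each message a second time.
        if not line.startswith('From:'):
            continue
        words = line.split()
        if len(words) < 2:
            continue  # guard against a 'From:' line with no address
        address = words[1]
        counts[address] = counts.get(address, 0) + 1
    return counts


# Sample lines assumed to mirror the file's format (not the real input).
sample = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008\n",
    "From: stephen.marquard@uct.ac.za\n",
    "Subject: hello\n",
    "\n",
    "From louis@media.berkeley.edu Fri Jan  4 18:10:48 2008\n",
    "From: louis@media.berkeley.edu\n",
]
print(count_senders(sample))
```

With the real file, something like `with open('mbox-short.txt') as fhand: domain = count_senders(fhand)` should work, since a file object can be iterated line by line.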

Answer 1

Try this:

# Iterate through each line of the file
for line in fhand:
    # Focus only on the sender header lines; the colon skips the
    # 'From <address> <date>' lines, which also start with 'From'
    if not line.startswith('From:'): continue

    # Turn the line into a list of strings
    words = line.split()

    # The sender's address is the second item, so no inner loop is needed
    word = words[1]

    # Add the address to the dictionary / increment the address's value
    domain[word] = domain.get(word, 0) + 1

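In the original code, the inner loop ran once for every word on the line, and each pass reset "word" to words[1]. The same address was therefore counted 9 times before the variable was updated to the next address.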
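The figure of 9 can be reproduced with a small sketch. The two sample lines below are an assumption about what one message looks like in the input (a `From <address> <date>` line plus a `From: <address>` line); the variable names are illustrative:

```python
# Two sample lines assumed to mirror one message in the input file.
lines = [
    "From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008",
    "From: stephen.marquard@uct.ac.za",
]

nested = {}
for line in lines:
    if not line.startswith('From'):
        continue
    words = line.split()
    for _ in words:            # runs len(words) times for this line
        address = words[1]     # always the same address on every pass
        nested[address] = nested.get(address, 0) + 1

# Number of words on each line: one increment happens per word.
words_per_line = [len(line.split()) for line in lines]
print(words_per_line)
print(nested)
```

Under that assumption the first line has 7 words and the second has 2, which gives 9 increments per message. It would also account for the drop to 7 when only lines whose first word is exactly 'From' are kept.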
