Unable to check the content of an email

Time: 2019-07-11 17:14:43

Tags: python-3.x

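I am trying to read the content of an mbox file and compare it against a list of words read from another file. I believe the problem is that I am reading the files incorrectly, because the output does not match what I know the files contain.

I tried reading them in 'rb' and 'r' modes, but had no luck. Then I tried putting the txt file into a list. The mbox file, however, could not be put into a list. As a further test, I tried reading the email content with the get_payload() function, but it returned bytes that were of no use to me.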

import mailbox

# Open the file that contains the blacklisted words and print it
with open("blacklist.txt",'r') as afile:
    buf=afile.read()
    print(buf)

# Open the mbox file
mbox = mailbox.mbox('Andishe.mbox')

# Read the content of each message in the mbox file (multipart or not)
for message in mbox:
    if message.is_multipart():
        print ("from   :",message['from'])
        print ("to   :",message['to'])
        content = message.as_string()
        # print(content)
    else:
        print ("from   :",message['from'])
        print ("to   :",message['to'])
        content = message.as_string()
        # print(content)


# Check whether the blacklisted words appear in the content of the email
for file in content:
    if file in buf:
        print("file contains blacklisted words" + file)
    else:
        print("file does not contain blacklisted words")

I expected the result to look like this:

some black listed word
file contains blacklisted words + the black listed word

Instead I get stuck in a loop that keeps printing. Here is part of what it prints:

file contains blacklisted wordsr
file contains blacklisted wordso
file contains blacklisted wordsm
file contains blacklisted words

I don't know what these "r", "o", "m" characters stand for or where they come from.

1 Answer:

Answer 0: (score: 0)

I figured out where I was going wrong.

1- I was reading the content of the txt file incorrectly. I should have used this:

    with open("blacklist.txt", 'r') as afile:
        blacklist = []
        for line in afile:
            blacklist.append(line.strip('\n'))

This way I got rid of the end-of-line characters and kept each line as a single word.

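2- I was also doing the for loop wrong, because I was not appending the content of the mbox file. Iterating over content, which is a string, yields one character at a time, which is where the stray "r", "o", "m" in the output came from. This solved the problem: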

content_string = ''.join(content)
content_string = content_string.lower()
for word in blacklist:
    if word.lower() in content_string:
        print("This black listed word exists in content         : ",word)
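
To tie both fixes together, here is a minimal sketch (not the poster's exact code) that checks every message rather than only the last one; in the question's loop, content is overwritten on each pass. The helper names load_blacklist, message_text, find_blacklisted and scan_mbox are made up for illustration. The sketch assumes one blacklisted word per line in a UTF-8 text file, and it skips blank lines because an empty string counts as a substring of any string and would match every message. It also turns the bytes returned by get_payload(decode=True) into text, using each part's declared charset where one is given.

```python
import mailbox


def load_blacklist(path):
    """Read one blacklisted word per line, skipping blank lines (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as afile:  # UTF-8 is an assumption
        return [line.strip() for line in afile if line.strip()]


def message_text(message):
    """Return the decoded text of every text/* part of one message."""
    chunks = []
    for part in message.walk():
        if part.get_content_maintype() != "text":
            continue  # skip multipart containers and non-text attachments
        payload = part.get_payload(decode=True)  # bytes, or None
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:  # the message declared an unknown charset
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def find_blacklisted(text, blacklist):
    """Return the blacklist entries that occur in text, ignoring case."""
    lowered = text.lower()
    return [word for word in blacklist if word and word.lower() in lowered]


def scan_mbox(mbox_path, blacklist):
    """Print the blacklisted words found in each message of an mbox file."""
    # create=False avoids silently creating an empty mbox if the path is wrong
    for message in mailbox.mbox(mbox_path, create=False):
        hits = find_blacklisted(message_text(message), blacklist)
        print("from:", message["from"], "| blacklisted words:", hits)
```

Calling something like scan_mbox('Andishe.mbox', load_blacklist('blacklist.txt')) would print one line per message. Matching here is by plain substring, so a short entry such as "free" would also match "freedom"; matching whole words only would need a regular expression or tokenization instead.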