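I'm trying to adapt this script, which I found by searching Google. It works perfectly with the emails I received previously, since it extracts the "From" field directly, and I got no errors.
Here is my code: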
#!/usr/bin/python
import imaplib
import sys
import email
import re
#FOLDER=sys.argv[1]
FOLDER='folder'
LOGIN='login@gmail.com'
PASSWORD='password'
IMAP_HOST = 'imap.gmail.com' # Change this according to your provider
email_list = []
email_unique = []
mail = imaplib.IMAP4_SSL(IMAP_HOST)
mail.login(LOGIN, PASSWORD)
mail.select(FOLDER)
result, data = mail.search(None, 'ALL')
ids = data[0]
id_list = ids.split()
for i in id_list:
    typ, data = mail.fetch(i,'(RFC822)')
    for response_part in data:
        if isinstance(response_part, tuple):
            msg = email.message_from_string(response_part[1])
            sender = msg['reply-to'].split()[0]
            address = re.sub(r'[<>]','',sender)
            # Ignore any occurrences of own email address and add to list
            if not re.search(r'' + re.escape(LOGIN),address) and not address in email_list:
                email_list.append(address)
                print address
Answer 0 (score: 3)
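The right way to do this is to use parseaddr from the standard library's email.utils module, rather than messing around with string splitting and slicing. It correctly handles the various legal address formats found in email headers.
Some examples: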
>>> from email.utils import parseaddr
>>> parseaddr("sally@foo.com")
('', 'sally@foo.com')
>>> parseaddr("<sally@foo.com>")
('', 'sally@foo.com')
>>> parseaddr("Sally <sally@foo.com>")
('Sally', 'sally@foo.com')
>>> parseaddr("Sally Smith <sally@foo.com>")
('Sally Smith', 'sally@foo.com')
>>>
Also, you shouldn't assume that an email has a Reply-To header. Many don't.
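Putting both points together, here is a minimal sketch of how the sender lookup could be written. It assumes Python 3, and it assumes that falling back to the From header is acceptable when Reply-To is missing. The helper name sender_address is hypothetical and not part of the original script.

```python
import email
from email.utils import parseaddr


def sender_address(msg):
    """Return the bare address to reply to, or None if no usable header exists.

    Hypothetical helper: the name and the From fallback are assumptions.
    """
    # msg.get() returns None for a missing header instead of raising,
    # so a message without Reply-To simply falls through to From.
    header = msg.get('Reply-To') or msg.get('From')
    if header is None:
        return None
    # parseaddr returns a (display name, address) pair; keep only the address.
    _name, address = parseaddr(header)
    # parseaddr yields an empty address when it cannot parse the header.
    return address or None


raw = "From: Sally Smith <sally@foo.com>\nSubject: hello\n\nbody\n"
msg = email.message_from_string(raw)
print(sender_address(msg))  # prints: sally@foo.com
```

In the loop from the question, the two lines that build sender and address would then become a single call such as address = sender_address(msg), followed by a check that address is not None before comparing it with LOGIN.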