Python: extracting the content of an email

Time: 2018-03-02 20:23:32

Tags: python python-2.7 html-email

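I wrote this code to fetch the content of an email from a Gmail account using imaplib. So far I have reached the point where the content is printed in the form shown further down.

Here is the code: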

import imaplib
from email.parser import HeaderParser


user = "email"
password =  "pass"

mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(user, password)
mail.list()
mail.select('inbox')

result, data = mail.search(None, "ALL")

ids = data[0] # data is a list.
id_list = ids.split() # ids is a space separated string
latest_email_id = id_list[-1] # get the latest
result, data = mail.fetch(latest_email_id, '(BODY.PEEK[TEXT])') # fetch the email body for the given ID

header_data = data[0][1] # raw body text only; BODY[TEXT] does not include the message headers

parser = HeaderParser()
msg = parser.parsestr(header_data)


print msg

When I run this code, what I end up with is:

From nobody Fri Mar 02 14:58:48 2018

--089e0823549c5eeef8056673549d
Content-Type: text/plain; charset="UTF-8"

this is the body

--089e0823549c5eeef8056673549d
Content-Type: text/html; charset="UTF-8"

<div dir="ltr">this is the body</div>

--089e0823549c5eeef8056673549d--

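I tried using beautifulsoup4 without success. How can I pull the body out of the <div dir="ltr">this is the body</div> part of the returned content and save it to a variable? Also, is there another way to do this?

Thanks in advance.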
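
For the extraction asked about above, one possible approach is to let the standard email package split the MIME parts instead of parsing the printed text by hand. The sketch below assumes Python 3 (the question uses Python 2.7) and assumes the whole message is fetched, for example with '(BODY.PEEK[])' rather than '(BODY.PEEK[TEXT])', so that the Content-Type header carrying the multipart boundary is available to the parser. The function name extract_html_body and the sample message are hypothetical and only stand in for the bytes returned by mail.fetch.

```python
import email


def extract_html_body(raw_message):
    """Return the first text/html part of a raw message (bytes), or None."""
    msg = email.message_from_bytes(raw_message)
    # walk() visits the message itself and every nested MIME part.
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            # decode=True undoes base64 / quoted-printable and returns bytes.
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "utf-8"
            return payload.decode(charset, errors="replace")
    return None


# Hypothetical sample standing in for data[0][1] after fetching the
# whole message; the headers are an assumption, the parts mirror the
# output shown in the question.
sample = (
    b'MIME-Version: 1.0\r\n'
    b'Content-Type: multipart/alternative; '
    b'boundary="089e0823549c5eeef8056673549d"\r\n'
    b'\r\n'
    b'--089e0823549c5eeef8056673549d\r\n'
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b'\r\n'
    b'this is the body\r\n'
    b'\r\n'
    b'--089e0823549c5eeef8056673549d\r\n'
    b'Content-Type: text/html; charset="UTF-8"\r\n'
    b'\r\n'
    b'<div dir="ltr">this is the body</div>\r\n'
    b'\r\n'
    b'--089e0823549c5eeef8056673549d--\r\n'
)

html_body = extract_html_body(sample)
print(html_body.strip())
```

The returned string is the HTML fragment itself, so it can be stored in a variable as is or handed to an HTML parser such as BeautifulSoup if only the visible text is wanted. Matching on "text/plain" instead of "text/html" in the same loop would return the plain-text alternative, which may avoid HTML parsing altogether.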

0 answers:

No answers yet