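I wrote this code to fetch the content of emails from a Gmail account using imaplib. So far I have gotten to the point where I can see the content in the form shown below.
Here is the code: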
import imaplib
from email.parser import HeaderParser
user = "email"
password = "pass"
mail = imaplib.IMAP4_SSL('imap.gmail.com')
mail.login(user, password)
mail.list()
mail.select('inbox')
result, data = mail.search(None, "ALL")
ids = data[0] # data is a list.
id_list = ids.split() # ids is a space separated string
latest_email_id = id_list[-1] # get the latest
result, data = mail.fetch(latest_email_id, '(BODY.PEEK[TEXT])') # fetch the email body for the given ID
header_data = data[0][1] # raw body text only; BODY.PEEK[TEXT] omits the message headers
parser = HeaderParser()
msg = parser.parsestr(header_data)
print msg
When I run this code, what I end up with is:
From nobody Fri Mar 02 14:58:48 2018
--089e0823549c5eeef8056673549d
Content-Type: text/plain; charset="UTF-8"
this is the body
--089e0823549c5eeef8056673549d
Content-Type: text/html; charset="UTF-8"
<div dir="ltr">this is the body</div>
--089e0823549c5eeef8056673549d--
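I tried using beautifulsoup4 without success. How can I extract the body inside <div dir="ltr">this is the body</div> from the returned content and save it to a variable? Also, is there another way to do this?
Thanks in advance.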
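
One possible approach, sketched under some assumptions: because BODY.PEEK[TEXT] returns the body without the message headers, the parser cannot see the multipart boundary declaration and treats everything as one flat payload. Fetching the complete message (for example with '(BODY.PEEK[])') and parsing it with the standard email package lets you walk the MIME parts and pick out the text/html one. The sketch below is written for Python 3, whereas the code in the question is Python 2. The helper names extract_html_part and html_to_text and the SAMPLE message are hypothetical, made up for illustration; SAMPLE mimics the output shown in the question with assumed top-level headers added.

```python
import email
from email import policy
from html.parser import HTMLParser


def extract_html_part(raw_bytes):
    """Return the decoded text/html part of a raw message, or None if absent."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.walk():
        if part.get_content_type() == "text/html":
            return part.get_content()  # str, decoded using the part's charset
    return None


class _TextCollector(HTMLParser):
    """Collects the text found between tags (stdlib alternative to BeautifulSoup)."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_to_text(html):
    """Strip the tags from an HTML fragment and return the remaining text."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return "".join(collector.chunks).strip()


# Hypothetical sample: the body from the question plus assumed top-level headers.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="089e0823549c5eeef8056673549d"\r\n'
    b"\r\n"
    b"--089e0823549c5eeef8056673549d\r\n"
    b'Content-Type: text/plain; charset="UTF-8"\r\n'
    b"\r\n"
    b"this is the body\r\n"
    b"--089e0823549c5eeef8056673549d\r\n"
    b'Content-Type: text/html; charset="UTF-8"\r\n'
    b"\r\n"
    b'<div dir="ltr">this is the body</div>\r\n'
    b"--089e0823549c5eeef8056673549d--\r\n"
)

html_body = extract_html_part(SAMPLE)  # the <div ...>...</div> markup
body_text = html_to_text(html_body)    # just the text inside the div

# With a live connection, the raw bytes would come from something like:
#   result, data = mail.fetch(latest_email_id, '(BODY.PEEK[])')
#   html_body = extract_html_part(data[0][1])
```

If only the plain text is needed, checking for "text/plain" instead of "text/html" in the loop would avoid the HTML step entirely.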