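I have an email body that looks something like the one below.
I want to strip all the headers from it and keep only the conversational text of the email. How can I do this in Python?
I tried the email.parser module, but it did not give me the result I wanted.
Please see the code below for more details.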
import email
a="""--c66f5985-233d-4e89-b598-6398b60cbe00
Content-Type: multipart/alternative;
differences="Content-Type";
boundary="d5eff9f8-76b3-4320-adfb-1e51add8fa8f"
--d5eff9f8-76b3-4320-adfb-1e51add8fa8f
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
THis is a demo email body
Thanks And Regards,
Ana
"""
b = email.message_from_string(a)
if b.is_multipart():
    for payload in b.get_payload():
        # if payload.is_multipart(): ...
        print(payload.get_payload())
else:
    print(b.get_payload())
Answer 0 (score: 0)
import imaplib
import email

hst = "your.host.adresse.com"
usr = "login"
pwd = "password"

imap = imaplib.IMAP4(hst)
try:
    imap.login(usr, pwd)
except Exception as e:
    raise IOError(e)

try:
    imap.select("Inbox")  # Tell IMAP which mailbox to use
    result, data = imap.uid('search', None, "ALL")
    latest = data[0].split()[-1]  # UID of the most recent message
    result, data = imap.uid('fetch', latest, '(RFC822)')
    a = data[0][1]  # This contains the mail data
except Exception as e:
    raise IOError(e)
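# imaplib returns bytes on Python 3; on Python 2 use email.message_from_string(a)
b = email.message_from_bytes(a)
if b.is_multipart():
    # Each iteration overwrites b, so the payload of the last part is kept
    for payload in b.get_payload():
        b = payload.get_payload()
else:
    b = b.get_payload()
print(b)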
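This removes everything from the mail that you do not want in the final text. I tested it with your code. You did not show how you load the mail (your a), so I suspect that is where a decoding problem could come from.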
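On the decoding point, here is a minimal sketch that walks every part of the message and returns only the decoded text/plain content. It uses only the standard library email package. The helper name extract_plain_text and the sample message are illustrative assumptions, not part of the original code, and the sample deliberately includes the blank line that separates headers from the body.

```python
import email


def extract_plain_text(raw):
    """Return the decoded text/plain parts of a raw message (hypothetical helper)."""
    # Accept bytes (as returned by imaplib on Python 3) as well as str
    if isinstance(raw, bytes):
        msg = email.message_from_bytes(raw)
    else:
        msg = email.message_from_string(raw)

    chunks = []
    for part in msg.walk():
        # Multipart containers only hold other parts, so skip them
        if part.get_content_maintype() == "multipart":
            continue
        if part.get_content_type() != "text/plain":
            continue
        # decode=True undoes quoted-printable / base64 and returns bytes
        payload = part.get_payload(decode=True)
        # Assumption: fall back to UTF-8 when the part declares no charset
        charset = part.get_content_charset() or "utf-8"
        chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)


# Illustrative sample; the boundary "XYZ" is made up for this sketch
sample = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "This is a demo email body=21\n"
    "--XYZ--\n"
)

print(extract_plain_text(sample))
```

Unlike a plain get_payload() call, walk() also reaches nested multipart parts, and decode=True turns quoted-printable sequences such as =21 back into the characters they stand for.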
If you run into any problems with HTML mails:
from bs4 import BeautifulSoup
soup = BeautifulSoup(b, 'html.parser')
soup = soup.get_text()
print(soup)
That should do the job now, but I recommend switching from the default Python parser to lxml or html5lib.
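If installing BeautifulSoup or an extra parser is not an option, a rough standard-library alternative is sketched below. It is a different technique from the one above: it uses html.parser.HTMLParser directly, and the names TextExtractor and html_to_text are hypothetical. Treat it as a starting point, since it is less tolerant of broken markup than lxml or html5lib.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements
        if not self._skip_depth and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Return the visible text of an HTML string, one text node per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


print(html_to_text("<html><body><p>Hello <b>Ana</b></p><style>p{}</style></body></html>"))
```

Each text node ends up on its own line, so inline markup such as b splits a sentence across lines; join the chunks with a space instead if that suits the mail content better.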