I have a txt file that contains multiple emails. Below is a sample of the text file, which holds several emails in .eml format:
From details
Return-Path: <emailaddress>
Delivered-To: email@address.com
Received: details
Received-SPF: details
Authentication-Results: details
Received: details
ARC-Seal: details
ARC-Message-Signature: details
Received: details
From: details
To: details
Subject: details
Thread-Topic: details
Thread-Index: details
Date: details
Message-ID:details
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/mixed;
boundary="_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_"
MIME-Version: 1.0
details
--_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: multipart/alternative;
boundary="_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_"
--_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
test with FA
--_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
<html>
<head>
<body>
details
</body>
</html>
--_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_--
--_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: details
Content-Description: details
Content-Disposition: attachment; filename="p2_eml.eml"; size=37836;
creation-date="Tue, 04 Aug 2020 10:48:34 GMT";
modification-date="Tue, 04 Aug 2020 10:48:34 GMT"
Content-Transfer-Encoding: base64
base64encoded data
--_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_--
From details <--- 2nd email starts --->
Return-Path: <emailaddress>
Delivered-To: email@address.com
Received: details
Received-SPF: details
Authentication-Results: details
Received: details
more details
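Using Python's email library, I can only capture the first email; the remaining ones are not processed. Parsing the file creates a single msg object that refers to the first email in the txt file.
Is there a way to extract all the emails from the txt file and process them one by one? `msg = email.message_from_file(message)` only returns the first email object; it does not give me the next message object.
Code I tried: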
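```python
import email
import logging

logger = logging.getLogger(__name__)


def extract_text(message):  # wrapper name assumed; only the body was posted
    # `message` is an open file object for the txt file.
    # message_from_file() parses the whole file as ONE message.
    msg = email.message_from_file(message)
    # Dump extra information: To, From, Date, Subject header values.
    dump_extra_info(msg)  # helper defined elsewhere in my code
    decoded_content_list = []
    for part in msg.walk():
        charset = part.get_content_charset()
        if part.get_content_type() == "application/octet-stream":
            logger.info("found content disposition returning ...")
            continue
        # Returns None for multipart containers, bytes for leaf parts.
        decoded_data = part.get_payload(decode=True)
        if decoded_data and charset is not None:
            utf8decoded = decoded_data.decode(charset)
            decoded_content_list.append(utf8decoded)
    return ' '.join(decoded_content_list)
```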
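One possible direction, sketched here under the assumption that the txt file is in mbox layout (every message starts with a `From ` separator line, as the sample suggests): the standard-library `mailbox.mbox` class splits such a file into individual messages, and each one supports the same `walk()` / `get_payload()` calls used above. The function name `extract_all_text`, the `SAMPLE` data, and the addresses in it are made up for illustration.

```python
import mailbox
import os
import tempfile


def extract_all_text(mbox_path):
    """Return one decoded-text string per message in an mbox-style file."""
    results = []
    # mailbox.mbox splits the file on lines that start with "From ".
    box = mailbox.mbox(mbox_path)
    try:
        for msg in box:
            parts = []
            for part in msg.walk():
                # Skip binary attachments, as in the original snippet.
                if part.get_content_type() == "application/octet-stream":
                    continue
                charset = part.get_content_charset()
                # None for multipart containers, bytes for leaf parts.
                data = part.get_payload(decode=True)
                if data and charset is not None:
                    parts.append(data.decode(charset, errors="replace"))
            results.append(" ".join(parts))
    finally:
        box.close()
    return results


# Tiny two-message sample in mbox layout (hypothetical addresses).
SAMPLE = (
    "From alice@example.com Tue Aug  4 10:48:34 2020\n"
    "From: alice@example.com\n"
    "Subject: first\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "\n"
    "test with FA\n"
    "\n"
    "From bob@example.com Tue Aug  4 10:50:00 2020\n"
    "From: bob@example.com\n"
    "Subject: second\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "\n"
    "second body\n"
)

if __name__ == "__main__":
    # Write the sample to a temporary file, then process each message in turn.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
        fh.write(SAMPLE)
    try:
        for i, text in enumerate(extract_all_text(fh.name), start=1):
            print(i, text.strip())
    finally:
        os.remove(fh.name)
```

One caveat with this approach: any body line that itself begins with `From ` would be treated as the start of a new message, so it only fits files where such lines are escaped or absent.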