Reading all emails from a text file containing multiple emails using Python

Time: 2020-08-04 08:15:35

Tags: python email

I have a txt file that contains multiple emails. Below is a sample of the text file, which holds several emails (.eml format):

From details
Return-Path: <emailaddress>
Delivered-To: email@address.com
Received: details
Received-SPF: details
Authentication-Results: details
Received: details
ARC-Seal: details
ARC-Message-Signature: details
Received: details
From: details
To: details
Subject: details
Thread-Topic: details
Thread-Index: details
Date: details
Message-ID:details
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/mixed;
boundary="_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_"
MIME-Version: 1.0
details
--_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: multipart/alternative;
boundary="_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_"
--_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
test with FA

--_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<html>
<head>
<body>
details
</body>
</html>

--_000_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_--

--_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_
Content-Type: details
Content-Description: details
Content-Disposition: attachment; filename="p2_eml.eml"; size=37836;
creation-date="Tue, 04 Aug 2020 10:48:34 GMT";
modification-date="Tue, 04 Aug 2020 10:48:34 GMT"
Content-Transfer-Encoding: base64
base64encoded data

--_004_DM5PR13MB138821372E6760B35B854B0CB74A0DM5PR13MB1388namp_--

From details <--- 2nd email starts --->
Return-Path: <emailaddress>
Delivered-To: email@address.com
Received: details
Received-SPF: details
Authentication-Results: details
Received: details
more details

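Using Python's email library, I can only capture the first email; the remaining emails are not processed. Parsing creates a single msg object that refers to the first email in the txt file.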
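To illustrate the behavior described above, here is a minimal sketch using made-up sample data (the addresses, dates, and subjects are assumptions, not taken from the real file). It suggests why only one message comes back: the parser reads the first header block and treats everything after it, including the second message's `From ` line and headers, as the body of the first message.

```python
import email

# Two tiny messages in one string, separated by an mbox-style "From " line.
# All values here are hypothetical sample data.
RAW = (
    "From sender@example.com Tue Aug  4 08:15:35 2020\n"
    "Subject: first\n"
    "\n"
    "body of the first message\n"
    "\n"
    "From sender@example.com Tue Aug  4 08:16:00 2020\n"
    "Subject: second\n"
    "\n"
    "body of the second message\n"
)

# The parser returns exactly one message object.
msg = email.message_from_string(RAW)
first_subject = msg["Subject"]

# The second message is not lost; it ends up inside the first one's payload.
payload = msg.get_payload()

print(first_subject)
print("Subject: second" in payload)
```

In the multipart sample shown earlier, the text that follows the final closing boundary would similarly be absorbed by the first message rather than parsed as a new one.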

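Is there a way to extract all the emails from the txt file and process them one by one? msg = email.message_from_file(message) only returns the first email object; it does not give me the next message object.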

Code I tried:

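    import email
    import logging

    logger = logging.getLogger(__name__)

    # The enclosing function was not shown in the original; this name is a placeholder.
    def extract_text(message):
        # `message` is an open file object; only the first email in it is parsed.
        msg = email.message_from_file(message)

        # Dump extra information: To, From, Date, Subject header values.
        # dump_extra_info is a helper defined elsewhere (not shown here).
        dump_extra_info(msg)

        decoded_content_list = []
        for part in msg.walk():
            charset = part.get_content_charset()
            # Skip binary attachments.
            if part.get_content_type() == "application/octet-stream":
                logger.info("found content disposition returning ...")
                continue
            decoded_data = part.get_payload(decode=True)
            if decoded_data and charset is not None:
                utf8decoded = decoded_data.decode(charset)
                decoded_content_list.append(utf8decoded)
        return ' '.join(decoded_content_list)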
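One possible approach, sketched here under the assumption that every email in the file begins with a line starting with `From ` (as in the sample above, which looks like the mbox layout): the standard-library `mailbox.mbox` class splits a file on those lines and yields one message object per email. The names `extract_text` and `read_all_messages`, the file name `emails.txt`, and the sample data are hypothetical and only serve the illustration.

```python
import mailbox
import os
import tempfile
from contextlib import closing


def extract_text(msg):
    """Join the decoded text parts of one message, skipping non-text parts."""
    chunks = []
    for part in msg.walk():
        # Multipart containers carry no payload of their own.
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumption: fall back to UTF-8 when no charset is declared.
        charset = part.get_content_charset() or "utf-8"
        chunks.append(payload.decode(charset, errors="replace"))
    return " ".join(chunks)


def read_all_messages(path):
    """Return (subject, text) for every message in an mbox-style file."""
    # create=False avoids silently creating an empty file if the path is wrong.
    with closing(mailbox.mbox(path, create=False)) as box:
        return [(msg["Subject"], extract_text(msg)) for msg in box]


# Hypothetical sample data: two messages separated by "From " lines.
SAMPLE = (
    "From sender@example.com Tue Aug  4 08:15:35 2020\n"
    "Subject: first\n"
    'Content-Type: text/plain; charset="iso-8859-1"\n'
    "\n"
    "test with FA\n"
    "\n"
    "From sender@example.com Tue Aug  4 08:16:00 2020\n"
    "Subject: second\n"
    "\n"
    "another body\n"
)

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "emails.txt")
    with open(sample_path, "w", newline="\n") as fh:
        fh.write(SAMPLE)
    results = read_all_messages(sample_path)

for subject, text in results:
    print(subject, "->", text.strip())
```

One caveat with this sketch: `mailbox.mbox` treats any line beginning with `From ` as a message boundary, so a body line that starts that way would be split incorrectly unless the file escapes it (commonly as `>From `).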

0 answers:

No answers