Parsing a non-multipart email in Python

Date: 2016-03-15 09:06:48

Tags: python email parsing

I have a script that parses raw emails. It works for multipart emails, but how do I parse a non-multipart email?

mail = email.message_from_string(raw_message)
if mail.is_multipart():
    data = extract(mail)
else:
    payload = mail.get_payload(decode=True)

The raw email:

Return-Path: <>
X-Original-To: bounces@mydomain.com
Delivered-To: bounces@mydomain.com
Received: from inmumg01.tcs.com (inmumg01.tcs.com [219.64.33.12])
    by smtp.mydomain.com (Postfix) with ESMTP id 603693FE11
    for <bounces@mydomain.com>; Tue, 15 Mar 2016 04:39:36 -0400 (EDT)
Received: from localhost by inmumg01.tcs.com;
  15 Mar 2016 14:09:38 +0530
Message-Id: <5aaa80$2543de@inmumg01.tcs.com>
Date: 15 Mar 2016 14:09:38 +0530
To: bounces@mydomain.com
From: "Mail Delivery System" <mail.notification@tcs.com>
Subject: Undeliverable Message

The following message to <vipul4.j@tcs.com> was undeliverable.
The reason for the problem:
5.1.0 - Unknown address error 550-'vipul4.j@tcs.com... No such user'

The IP address of the MTA to which the message could not be sent:
172.17.9.35

---------- A copy of the message begins below this line ----------
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IPAS-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IronPort-AV: E=Sophos;i="5.24,338,1454956200";
   d="scan'208,217";a="72486315"
X-Amp-Result: Clean
X-Amp-File-Uploaded: False
Received: from smtp.mydomain.com ([139.59.240.124])
  by inmumg01.tcs.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Mar 2016 14:09:37 +0530
Received: from 128.199.202.14 (unknown [128.199.202.14])
    (Authenticated sender: mailsender)
    by smtp.mydomain.com (Postfix) with ESMTPA id 0D41F3FE11
    for <vipul4.j@tcs.com>; Tue, 15 Mar 2016 04:39:33 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kitemailer.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=Xxaf++WE0B7HL+FN28O76df7gYNEIKzk8eE9VpxrnMBCpGWPKWBMMfVDfCyie3NBJ
     GJiMxn/Yhn+ey6Mr5R5AK5JO5n72yWlytLm0RepMEydaeHHVQPx7bE+LMDMlORSFin
     bWdnz58lNMuZ3w9qtqjCXt22Sk5yXfCO71tRgfus=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=FiGdkSE9LCjYkfYyWq65GbZoMZVCQs5OXXJA35CyGQtjPWbvwIKvx7Z6Ff39EBRLf
     Vu+6PUrvwyZLFh/1CW0NGOHDgUDjWWQ2jHfnNpJ9QEbHgOwomuMty10HDeZnIr0zM7
     8mFCgeCbiiyusQkhmXh5aYqqD+Q/1wFcrpLpkBZc=
Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT)
From: Kitemailer Newsletter <info@kitemailer.com>
To: vipul4.j@tcs.com
Message-ID: <15106466-1b13-4d64-b220-15f05f4815b7-1458031171312@smtp.mydomain.com>
Subject: KiteMailer | New Features this Week
MIME-Version: 1.0
Content-Type: multipart/mixed;
    boundary="----=_Part_44_1398250960.1458031171306"
List-Unsubscribe: <http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>
Feedback-ID: 19:96:1520615:MyDomain

Now, in the else branch, I want to extract information from the message. If I try payload['to'], it raises TypeError: string indices must be integers, not str
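For context, the TypeError arises because get_payload() on a non-multipart message returns the body as a plain string (or bytes with decode=True), not a mapping. Headers are looked up on the Message object itself. A minimal sketch, using a shortened, hypothetical stand-in for the bounce message above:

```python
import email

# Hypothetical, shortened stand-in for the bounce message in the question.
raw_message = (
    "To: bounces@mydomain.com\n"
    "Subject: Undeliverable Message\n"
    "\n"
    "The following message was undeliverable.\n"
)

mail = email.message_from_string(raw_message)

# Headers live on the Message object; lookups are case-insensitive.
to_header = mail["to"]
subject = mail["Subject"]

# For a non-multipart message the payload is only the body text.
body = mail.get_payload()

print(to_header)
print(subject)
print(body)
```

Note that this only reaches the headers of the bounce itself. The headers of the original message are quoted inside the body, so they are not available through mail[...].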

1 answer:

Answer 0 (score: 0)

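Well, supposing you cannot do this with the email library (I don't know whether you can), you could convert the raw message into a dictionary and look up your elements there:

Here is your raw message: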

raw_message='''Return-Path: <>
X-Original-To: bounces@mydomain.com
Delivered-To: bounces@mydomain.com
Received: from inmumg01.tcs.com (inmumg01.tcs.com [219.64.33.12])
    by smtp.mydomain.com (Postfix) with ESMTP id 603693FE11
    for <bounces@mydomain.com>; Tue, 15 Mar 2016 04:39:36 -0400 (EDT)
Received: from localhost by inmumg01.tcs.com;
  15 Mar 2016 14:09:38 +0530
Message-Id: <5aaa80$2543de@inmumg01.tcs.com>
Date: 15 Mar 2016 14:09:38 +0530
To: bounces@mydomain.com
From: "Mail Delivery System" <mail.notification@tcs.com>
Subject: Undeliverable Message

The following message to <vipul4.j@tcs.com> was undeliverable.
The reason for the problem:
5.1.0 - Unknown address error 550-'vipul4.j@tcs.com... No such user'

The IP address of the MTA to which the message could not be sent:
172.17.9.35

---------- A copy of the message begins below this line ----------
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IPAS-Result: A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE
X-IronPort-AV: E=Sophos;i="5.24,338,1454956200";
   d="scan'208,217";a="72486315"
X-Amp-Result: Clean
X-Amp-File-Uploaded: False
Received: from smtp.mydomain.com ([139.59.240.124])
  by inmumg01.tcs.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Mar 2016 14:09:37 +0530
Received: from 128.199.202.14 (unknown [128.199.202.14])
    (Authenticated sender: mailsender)
    by smtp.mydomain.com (Postfix) with ESMTPA id 0D41F3FE11
    for <vipul4.j@tcs.com>; Tue, 15 Mar 2016 04:39:33 -0400 (EDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kitemailer.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=Xxaf++WE0B7HL+FN28O76df7gYNEIKzk8eE9VpxrnMBCpGWPKWBMMfVDfCyie3NBJ
     GJiMxn/Yhn+ey6Mr5R5AK5JO5n72yWlytLm0RepMEydaeHHVQPx7bE+LMDMlORSFin
     bWdnz58lNMuZ3w9qtqjCXt22Sk5yXfCO71tRgfus=
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;
    s=kitemail; t=1458031173;
    bh=1KFZSL77mNYuQ3iTjpNMdGcBOp2a4pGQnLVlq49ZrGg=;
    h=Date:From:To:Subject:List-Unsubscribe;
    b=FiGdkSE9LCjYkfYyWq65GbZoMZVCQs5OXXJA35CyGQtjPWbvwIKvx7Z6Ff39EBRLf
     Vu+6PUrvwyZLFh/1CW0NGOHDgUDjWWQ2jHfnNpJ9QEbHgOwomuMty10HDeZnIr0zM7
     8mFCgeCbiiyusQkhmXh5aYqqD+Q/1wFcrpLpkBZc=
Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT)
From: Kitemailer Newsletter <info@kitemailer.com>
To: vipul4.j@tcs.com
Message-ID: <15106466-1b13-4d64-b220-15f05f4815b7-1458031171312@smtp.mydomain.com>
Subject: KiteMailer | New Features this Week
MIME-Version: 1.0
Content-Type: multipart/mixed;
    boundary="----=_Part_44_1398250960.1458031171306"
List-Unsubscribe: <http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>
Feedback-ID: 19:96:1520615:MyDomain'''

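I am using your code to get the payload:

# In case the message is not multipart
import email

mail = email.message_from_string(raw_message)
payload = mail.get_payload(decode=True)
if isinstance(payload, bytes):  # Python 3 returns bytes here; convert to text before splitting
    payload = payload.decode("utf-8", errors="replace")

# Keep only lines shaped like "Header-Name: value" (a colon, and no space in the name)
mail_dico = {
    elt.split(":", 1)[0].strip(): elt.split(":", 1)[1].strip()
    for elt in payload.split("\n")
    if ":" in elt and " " not in elt.split(":", 1)[0].strip()
}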

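Here is your dictionary (headers that occur more than once, such as Received and DKIM-Signature, keep only their last occurrence, and folded continuation lines are dropped):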

{'Content-Type': 'multipart/mixed;',
 'DKIM-Signature': 'v=1; a=rsa-sha256; c=relaxed/simple; d=mydomain.com;',
 'Date': 'Tue, 15 Mar 2016 04:39:31 -0400 (EDT)',
 'Feedback-ID': '19:96:1520615:MyDomain',
 'From': 'Kitemailer Newsletter <info@kitemailer.com>',
 'List-Unsubscribe': '<http://example.com/unsubscribe/dmlwdWw0LmpAdGNzLmNvbSM5Ng==>',
 'MIME-Version': '1.0',
 'Message-ID': '<15106466-1b13-4d64-b220-15f05f4815b7-1458031171312@smtp.mydomain.com>',
 'Received': 'from 128.199.202.14 (unknown [128.199.202.14])',
 'Subject': 'KiteMailer | New Features this Week',
 'To': 'vipul4.j@tcs.com',
 'X-Amp-File-Uploaded': 'False',
 'X-Amp-Result': 'Clean',
 'X-IPAS-Result': 'A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE',
 'X-IronPort-AV': 'E=Sophos;i="5.24,338,1454956200";',
 'X-IronPort-Anti-Spam-Filtered': 'true',
 'X-IronPort-Anti-Spam-Result': 'A0CGBgCEyedW/3zwO4tdHgEBAg4BgklMUm2nXoJekBMBDYFmBxUFAQ2HGwI4FAEBAQEBAQFkJ4RLIAoTAQEECCwGSQMBCQICMTsFHASHJ10FCatgZ4RBAQSLKQaBD4REgkIBhlERAWqCNBOBJ5MJhEuBLwKEPogSAoIuhnOFYo1YgUUBAUKBNgwBgj5OB4kqgTIBAQE',
 'h=Date': 'From:To:Subject:List-Unsubscribe;'}

Now you can access your elements:

print(mail_dico["To"])
>> vipul4.j@tcs.com

print(mail_dico["Subject"])
>> KiteMailer | New Features this Week

This may not be the best approach, but I hope it helps.
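As an alternative to splitting lines by hand, the quoted copy can itself be handed to the email parser, which takes care of folded headers, colons inside values, and repeated headers. The following is only a sketch under some assumptions: the separator text is taken from the sample bounce above (other mail servers word it differently), and the helper name parse_embedded_headers and the shortened sample message are made up for illustration.

```python
import email

# Separator text from the sample bounce above; other MTAs use different wording.
MARKER = "A copy of the message begins below this line"


def parse_embedded_headers(raw_message):
    """Return the quoted original message as a Message, or None if not found."""
    mail = email.message_from_string(raw_message)
    if mail.is_multipart():
        return None
    payload = mail.get_payload(decode=True)
    # On Python 3, get_payload(decode=True) returns bytes.
    if isinstance(payload, bytes):
        payload = payload.decode("utf-8", errors="replace")
    lines = payload.splitlines()
    for index, line in enumerate(lines):
        if MARKER in line:
            # Treat everything after the marker line as a nested message.
            return email.message_from_string("\n".join(lines[index + 1:]))
    return None


# Hypothetical, shortened bounce used only to exercise the helper.
sample = (
    "To: bounces@mydomain.com\n"
    "Subject: Undeliverable Message\n"
    "\n"
    "The following message was undeliverable.\n"
    "\n"
    "---------- A copy of the message begins below this line ----------\n"
    "Received: from a.example by b.example;\n"
    "  15 Mar 2016 14:09:37 +0530\n"
    "Received: from c.example by a.example\n"
    "Date: Tue, 15 Mar 2016 04:39:31 -0400 (EDT)\n"
    "To: vipul4.j@tcs.com\n"
    "Subject: KiteMailer | New Features this Week\n"
)

original = parse_embedded_headers(sample)
print(original["To"])
print(original["Date"])
print(len(original.get_all("Received")))
```

Compared with the dictionary, the Message object keeps every occurrence of a repeated header (via get_all) and returns complete values such as the full Date string. It still depends on the bounce containing a recognizable separator line, so the None case needs handling.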