Parsing message headers with Perl

Asked: 2016-01-04 15:17:54

Tags: perl email parsing data-dumper

My programming skills are intermediate at best, and I have never used Perl before, so please be gentle in your replies.

I am trying to extract the original "From" address (not the "envelope-from" address) from an inbound email.

I am parsing inbound email that passes through the MailScanner software on my server. If I write (using MailScanner's built-in message object):

my($message) = @_;
MailScanner::Log::InfoLog("from address: @{$message->{headers}}");

I get the following log entry (sanitized):

Received: from [192.168.12.34] (port=56309 helo=theirserver.theirdomain.tld)    by server.mydomain.tld with esmtp (Exim 4.86)   (envelope-from <sender@theirdomain.tld>)    id 1aG62o-0002ad-Hu     for recipient@mydomain.tld; Mon, 04 Jan 2016 09:23:34 -0500 Received: from 00a657f7.theirserver.theirdomain.tld ([127.0.0.1]:8056 helo=theirserver.theirdomain.tld)     by theirserver.theirdomain.tld with ESMTP id 00PA657MF7;    for <recipient@mydomain.tld>; Mon, 4 Jan 2016 06:22:53 -0800 Date: Mon, 4 Jan 2016 06:22:53 -0800 To: <recipient@mydomain.tld> Message-ID: <70562391089443970564001376171645@theirserver.theirdomain.tld> From: "Sender" <sender@theirdomain.tld> Subject: test Content-Language: en-us MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: multipart/alternative;  boundary="----=Part.960.1818.1451917373"

If I write (as suggested by MailScanner's author):

my($message) = @_;
my $from_address = grep /^From:\s+/i, @{$message->{headers}}; 
MailScanner::Log::InfoLog("from address after grep = $from_address "); 

I get the following log entry:

from address after grep = 0

Not knowing what to do with that result, I tried Data::Dumper, using a MailScanner-compatible script I found online, which produced the following:

> $VAR1 = bless( {
                 'nameinfected' => 0,
                 'otherinfected' => 0,
                 'disarmedtags' => [],
                 'othertypes' => {},
                 'file2entity' => {
                                    '' => bless( {
                                                   'ME_Parts' => [
                                                                   bless( {
                                                                            'ME_Bodyhandle' => bless( {
                                                                                                        'MB_Path' => '/var/spool/MailScanner/incoming/9365/1aG62o-0002ad-Hu/nmsg-9365-3.txt'
                                                                                                      }, 'MIME::Body::File' ),
                                                                            'ME_Parts' => [],
                                                                            'mail_inet_head' => bless( {
                                                                                                         'mail_hdr_foldlen' => 79,
                                                                                                         'mail_hdr_modify' => 0,
                                                                                                         'mail_hdr_list' => [
                                                                                                                              'Content-Transfer-Encoding: 8bit
',
                                                                                                                              'Content-Type: text/plain; charset="UTF-8"
'
                                                                                                                            ],
                                                                                                         'mail_hdr_hash' => {
                                                                                                                              'Content-Type' => [
                                                                                                                                                  \$VAR1->{'file2entity'}{''}{'ME_Parts'}[0]{'mail_inet_head'}{'mail_hdr_list'}[1]
                                                                                                                                                ],
                                                                                                                              'Content-Transfer-Encoding' => [
                                                                                                                                                               \$VAR1->{'file2entity'}{''}{'ME_Parts'}[0]{'mail_inet_head'}{'mail_hdr_list'}[0]
                                                                                                                                                             ]
                                                                                                                            },
                                                                                                         'mail_hdr_mail_from' => 'KEEP',
                                                                                                         'mail_hdr_lengths' => {}
                                                                                                       }, 'MIME::Head' )
                                                                          }, 'MIME::Entity' ),
                                                                   bless( {
                                                                            'ME_Bodyhandle' => bless( {
                                                                                                        'MB_Path' => '/var/spool/MailScanner/incoming/9365/1aG62o-0002ad-Hu/nmsg-9365-42.html'
                                                                                                      }, 'MIME::Body::File' ),
                                                                            'ME_Parts' => [],
                                                                            'mail_inet_head' => bless( {
                                                                                                         'mail_hdr_foldlen' => 79,
                                                                                                         'mail_hdr_modify' => 0,
                                                                                                         'mail_hdr_list' => [
                                                                                                                              'Content-Transfer-Encoding: 8bit
',
                                                                                                                              'Content-Type: text/html; charset="UTF-8"
'
                                                                                                                            ],
                                                                                                         'mail_hdr_hash' => {
                                                                                                                              'Content-Type' => [
                                                                                                                                                  \$VAR1->{'file2entity'}{''}{'ME_Parts'}[1]{'mail_inet_head'}{'mail_hdr_list'}[1]
                                                                                                                                                ],
                                                                                                                              'Content-Transfer-Encoding' => [
                                                                                                                                                               \$VAR1->{'file2entity'}{''}{'ME_Parts'}[1]{'mail_inet_head'}{'mail_hdr_list'}[0]
                                                                                                                                                             ]
                                                                                                                            },
                                                                                                         'mail_hdr_mail_from' => 'KEEP',
                                                                                                         'mail_hdr_lengths' => {}
                                                                                                       }, 'MIME::Head' )
                                                                          }, 'MIME::Entity' )
                                                                 ],
                                                   'ME_Epilogue' => [
                                                                      '
'
                                                                    ],
                                                   'ME_Preamble' => [],
                                                   'mail_inet_head' => bless( {
                                                                                'mail_hdr_foldlen' => 79,
                                                                                'mail_hdr_modify' => 0,
                                                                                'mail_hdr_list' => [
                                                                                                     'Received: from [192.168.12.34] (port=56309 helo=theirserver.theirdomain.tld)
    by server.mydomain.tld with esmtp (Exim 4.86)
    (envelope-from <sender@theirdomain.tld>)
    id 1aG62o-0002ad-Hu
    for recipient@mydomain.tld; Mon, 04 Jan 2016 09:23:34 -0500
',
                                                                                                     'Received: from 00a657f7.theirserver.theirdomain.tld ([127.0.0.1]:8056 helo=theirserver.theirdomain.tld)
    by theirserver.theirdomain.tld with ESMTP id 00PA657MF7;
    for <recipient@mydomain.tld>; Mon, 4 Jan 2016 06:22:53 -0800
',
                                                                                                     'Date: Mon, 4 Jan 2016 06:22:53 -0800
',
                                                                                                     'To: <recipient@mydomain.tld>
',
                                                                                                     'Message-ID: <70562391089443970564001376171645@theirserver.theirdomain.tld>
',
                                                                                                     'From: "Sender" <sender@theirdomain.tld>
',
                                                                                                     'Subject: Test
',
                                                                                                     'Content-Language: en-us
',
                                                                                                     'MIME-Version: 1.0
',
                                                                                                     'Content-Transfer-Encoding: 8bit
',
                                                                                                     'Content-Type: multipart/alternative;
    boundary="----=Part.960.1818.1451917373"
'
                                                                                                   ],

and so on.

So next I tried to parse mail_hdr_list with the following:

my($message) = @_;
MailScanner::Log::InfoLog("SpamWhitelist $msgid: mail_hdr_list @{$message->{headers}}[mail_hdr_list]");

I got this result:

Received: from server.theirdomain.tld ([192.168.165.54]:49620 helo=server.theirdomain.tld)

I am confused. I cannot figure out how to get the From: address, rather than the envelope-from address, out of this object.

Any help rewriting my code would be greatly appreciated.

1 Answer:

Answer 0 (score: 0)

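The data you are trying to extract lives in a blessed MIME::Entity object. That means that when you use Data::Dumper or Data::Dumper::Perltidy, you are looking at the internal structure of an object that should be accessed through the package's methods.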

Based on a quick read of the MIME::Head documentation, you probably want to call ->get('From') and similar methods on the object you are accessing.
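As a rough illustration of that method-based access, here is a minimal sketch. It builds a MIME::Head from a few sample header lines (made-up data standing in for a real message), reads the From: field with get, and then pulls the bare address out with Mail::Address. How you reach the MIME::Entity inside MailScanner's message object is not shown here and would be an assumption on my part; once you have an entity, $entity->head returns its MIME::Head.

```perl
use strict;
use warnings;
use MIME::Head;      # part of the MIME-tools distribution
use Mail::Address;   # part of the MailTools distribution

# Sample header lines (hypothetical stand-in for a real message).
my @lines = map { "$_\n" } (
    'Received: from [192.168.12.34] by server.mydomain.tld with esmtp',
    'To: <recipient@mydomain.tld>',
    'From: "Sender" <sender@theirdomain.tld>',
    'Subject: test',
);

# MIME::Head inherits from Mail::Header, whose constructor accepts
# a reference to an array of header lines.
my $head = MIME::Head->new(\@lines);

# In scalar context get() returns the body of the first matching field.
# The trailing newline is kept, so strip it.
my $from = $head->get('From');
chomp $from if defined $from;

# Mail::Address->parse returns a list of address objects;
# address() gives the bare mailbox without the display name.
my ($parsed) = defined $from ? Mail::Address->parse($from) : ();
my $address  = $parsed ? $parsed->address : undef;

print "From field:   $from\n"    if defined $from;
print "From address: $address\n" if defined $address;
```

With a real entity the same call would look like $entity->head->get('From'). This reads the From: header, which is separate from the envelope-from value that Exim records in the Received: line.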

See https://metacpan.org/pod/MIME::Head#Getting-field-contents
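Two details of the attempts in the question may also explain the odd output. In scalar context grep returns the number of matching elements, not the matching line, so the logged value can only ever be a count. And a name used as an array subscript is treated as a number: "mail_hdr_list" numifies to 0, so that expression picks the first element of the array, which is a Received: line. The sketch below shows both effects on a made-up @headers array. The sample data and the simple angle-bracket regex are my own assumptions, not MailScanner behavior.

```perl
use strict;
use warnings;

# Sample header lines standing in for @{$message->{headers}} (hypothetical data).
my @headers = (
    'Received: from [192.168.12.34] by server.mydomain.tld with esmtp',
    'To: <recipient@mydomain.tld>',
    'From: "Sender" <sender@theirdomain.tld>',
    'Subject: test',
);

# Scalar context: grep returns how many elements matched.
my $count = grep { /^From:\s+/i } @headers;

# List context (note the parentheses): grep returns the matching elements,
# and the first one is assigned to $from_line.
my ($from_line) = grep { /^From:\s+/i } @headers;

# A non-numeric string used as an array index numifies to 0,
# so this selects the first element of the array.
my $first;
{
    no warnings 'numeric';
    $first = $headers["mail_hdr_list"];
}

# Crude extraction of the address between angle brackets; a real parser
# such as Mail::Address copes with far more address formats.
my ($addr) = defined $from_line ? $from_line =~ /<([^>]+)>/ : ();

print "match count:   $count\n";
print "From line:     $from_line\n" if defined $from_line;
print "element 0:     $first\n";
print "bare address:  $addr\n"      if defined $addr;
```

If the count is still 0 against the real header array, no element begins with From: in the form the pattern expects, and it would be worth logging each element separately to see what it actually contains.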

Hope this helps.