How do I get a complete Mail::SpamAssassin::Message object from text?

Asked: 2010-02-04 12:04:55

Tags: perl spamassassin

I'm using the following code to generate a spam report with SpamAssassin:

use Mail::SpamAssassin;

my $sa = Mail::SpamAssassin->new();

open FILE, "<", "mail.txt";
my @lines = <FILE>;
my $mail = $sa->parse(@lines);

my $status = $sa->check($mail);

my $report = $status->get_report();
$report =~ s/\n/\n<br>/g;

print "<h1>Spam Report</h1>";
print $report;

$status->finish();
$mail->finish();
$sa->finish();

The problem I'm running into is that it classifies 'sample-nonspam.txt' as spam:

Content preview: [...] 

Content analysis details: (6.9 points, 5.0 required) 

 pts rule name              description
---- ---------------------- --------------------------------------------------
-0.0 NO_RELAYS              Informational: message was not relayed via SMTP
 1.2 MISSING_HEADERS        Missing To: header
 0.1 MISSING_MID            Missing Message-Id: header
 1.8 MISSING_SUBJECT        Missing Subject: header
 2.3 EMPTY_MESSAGE          Message appears to have no textual parts and no
                            Subject: text
-0.0 NO_RECEIVED            Informational: message has no Received headers
 1.4 MISSING_DATE           Missing Date: header
 0.0 NO_HEADERS_MESSAGE     Message appears to be missing most RFC-822 headers

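All of that information is present in the file. What worries me is that the documentation states, "Parse will return a Mail::SpamAssassin::Message object with just the headers parsed." Does that mean it won't return the complete message?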

2 answers:

Answer 0 (score: 1)

You're missing one character:

my $mail = $sa->parse(\@lines);

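From the documentation (emphasis added):

parse($message, $parse_now [, $suppl_attrib])

Parse will return a Mail::SpamAssassin::Message object with just the headers parsed. When calling this function, there are two optional parameters that can be passed in: $message is either undef (which will use STDIN), a scalar of the entire message, *an array reference of the message with 1 line per array element*, or a file glob which holds the entire contents of the message; and $parse_now, which specifies whether or not to create the MIME tree at parse time or later as necessary.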

With the change above, I get the following output (HTML stripped):

 pts rule name              description
---- ---------------------- --------------------------------------------------
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0000]
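
Why the single backslash matters: Perl flattens an array into the argument list, so a call written as parse(@lines) hands over each line as a separate argument, and only the first line lands in the $message slot. That would be consistent with the "missing headers" and "empty message" rules in the original report. The sketch below shows the difference using describe_args, a hypothetical stand-in that takes the same two leading parameters as parse; it is not part of SpamAssassin, and the sample lines are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a method with the signature
# parse($message, $parse_now). It only reports what it was given.
sub describe_args {
    my ($message, $parse_now) = @_;
    my $kind = ref($message) eq 'ARRAY'
        ? 'array reference with ' . scalar(@$message) . ' lines'
        : 'plain scalar';
    return { count => scalar(@_), message_kind => $kind, message => $message };
}

# Made-up sample message, one line per element.
my @lines = (
    "Subject: hello\n",
    "To: someone\@example.com\n",
    "\n",
    "Body text\n",
);

my $flattened = describe_args(@lines);    # array is flattened into separate arguments
my $by_ref    = describe_args(\@lines);   # a single argument: a reference to the array

printf "parse(\@lines):   %d args, message is a %s\n",
    $flattened->{count}, $flattened->{message_kind};
printf "parse(\\\@lines):  %d arg,  message is an %s\n",
    $by_ref->{count}, $by_ref->{message_kind};
```

In the flattened call the second line of the mail also ends up in the $parse_now position, which is why nothing complains: the call is syntactically valid, just not what was intended.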

As the documentation says, parse is flexible. You could instead use

my $mail = $sa->parse(join "" => <FILE>);  # scalar of the entire message

my $mail = $sa->parse(\*FILE);             # a file glob with the entire contents

my $mail;
{ local $/; $mail = $sa->parse(<FILE>) }   # scalar of the entire message

or even

open STDIN, "<", "mail.txt" or die "$0: open: $!";
my $mail = $sa->parse(undef);              # undef means read STDIN

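For these four examples to work as expected, you'll need to remove the line my @lines = <FILE>. That line reads FILE to end-of-file, so the examples that read from FILE afterwards would get nothing.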
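
A small sketch of the two filehandle behaviours these alternatives depend on: a handle that has already been read to end-of-file yields nothing on a second read, and undefining $/ makes a single read return the whole input. It uses an in-memory filehandle and a made-up message so that it runs without mail.txt or SpamAssassin; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Made-up sample message held in memory.
my $text = "Subject: hello\nTo: someone\@example.com\n\nBody text\n";

# Reading in list context consumes the handle up to end-of-file ...
open my $fh, '<', \$text or die "open: $!";
my @first  = <$fh>;
my @second = <$fh>;   # ... so a second read returns an empty list
close $fh;

# With $/ (the input record separator) undefined, one read returns
# the entire input as a single element instead of one element per line.
open my $whole, '<', \$text or die "open: $!";
my @chunks = do { local $/; <$whole> };
close $whole;

printf "first read:  %d lines\n", scalar @first;
printf "second read: %d lines\n", scalar @second;
printf "slurp read:  %d element, same as input: %s\n",
    scalar @chunks, ($chunks[0] eq $text ? 'yes' : 'no');
```

Because the slurped read produces exactly one element, passing it straight to a function still delivers the message as a single scalar argument, even though the read happens in list context.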

Answer 1 (score: 0)

This is the correct way to construct the message, where $content holds the full text of the mail:

my $mail = Mail::SpamAssassin::Message->new({ "message" => $content });