Procmail recipe to remove the footer

Time: 2012-05-24 09:10:43

Tags: linux email procmail

I'm having some trouble with a procmail recipe.

Here is what I have so far:

    :0
    * ^X-Loop: myemail@gmail\.com
    /dev/null

    # filtering email by number 60
    :0
    * ^Subject:.*(60)
    {
      :0c:
      ${DEFAULT}

      # trying to take out input from the body
      :0fb
      | head -10

      # Forward it to the other folder
      :0
      mytest/
    }

The problem appears when procmail reads the body of the email. It produces output like this:

   +96szV6aBDlD/F7vuiK8fUYVknMQPfPmPNikB+fdYLvbwsv9duz6HQaDuwhGn6dh9w2U
   1sABcykpdyfWqWhLt5RzCqppYr5I4yCmB1CNOKwhlzI/w8Sx1QTzGT32G/ERTlbr91BM VmNQ==
   MIME-Version: 1.0
   Received: by 10.52.97.41 with SMTP id dx9mr14500007vdb.89.1337845760664; Thu,
   24 May 2012 00:49:20 -0700 (PDT)
   Received: by 10.52.34.75 with HTTP; Thu, 24 May 2012 00:49:20 -0700 (PDT)
   Date: Thu, 24 May 2012 15:49:20 +0800
   Message-ID: <CAE1Fe-r4Lid+YSgFTQdpsniE_wzeGjETWLLJJxat+HK94u1=AQ@mail.gmail.com>
   Subject: 60136379500
   From: my email <my email@gmail.com>
   To: your email <your email@gmail.com>
   Content-Type: multipart/alternative; boundary=20cf307f380654240604c0c37d07

   --20cf307f380654240604c0c37d07
   Content-Type: text/plain; charset=ISO-8859-1

   hi
   there
   how
   are
   you

   --20cf307f380654240604c0c37d07
   +96szV6aBDlD/F7vuiK8fUYVknMQPfPmPNikB+fdYLvbwsv9duz6HQaDuwhGn6dh9w2U
   1sABcykpdyfWqWhLt5RzCqppYr5I4yCmB1CNOKwhlzI/w8Sx1QTzGT32G/ERTlbr91BM VmNQ==

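I have managed to get the output, but it does not work if the sender writes fewer than 3 lines, because the output then also includes the email's footer (since it falls within the range of -10).

I only want procmail to filter out the body of the email (and print it to a text file). Is that possible? Can anyone point me in the right direction? I'm at my wits' end. Thanks.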

2 answers:

Answer 0 (score: 1)

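Trying to treat a MIME multipart as a lump of text is fraught with peril. To process the body properly, you should use a MIME-aware tool. But if you just want to assume that the first part is a text part and drop all the other parts, you can create something fairly simple and robust.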

# Truncate everything after first body part:
# Change second occurrence of --$MATCH to --$MATCH--
# and trim anything after it
:0fb
* ^Content-type: multipart/[a-z]+; boundary="\/[^"]+
| sed -e "1,/^--$MATCH$/b" -e "/^--$MATCH$/!b" -e 's//&--/' -eq
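
To try the sed filter outside procmail, the following sketch sets `MATCH` by hand (procmail would normally fill it in from the `\/` match) and feeds the filter a made-up two-part body. The boundary string is the one from the question; the sample body, its preamble line, and the function names are illustrative assumptions. One thing to keep in mind: the address `1,/re/` only starts looking for the end of the range on line 2, so if the body handed to sed begins directly with the boundary line (no preamble), the range runs to the second boundary instead and the cut happens one part later, or not at all for a two-part message.

```shell
#!/bin/sh
# Stand-in for procmail's $MATCH; the value is the boundary from the question.
MATCH=20cf307f380654240604c0c37d07

# The sed filter from the recipe, wrapped in a function for testing
# ("-e q" is the same as the recipe's "-eq").
first_part_only() {
    sed -e "1,/^--$MATCH$/b" -e "/^--$MATCH$/!b" -e 's//&--/' -e q
}

# Made-up message body with a preamble line before the first boundary.
sample_body() {
    cat <<EOF
This is a multi-part message in MIME format.
--$MATCH
Content-Type: text/plain; charset=ISO-8859-1

hi
there

--$MATCH
Content-Type: text/html; charset=ISO-8859-1

<p>hi there</p>
--$MATCH--
EOF
}

# Prints the preamble, the first part, and a closing "--boundary--" line.
sample_body | first_part_only
```

Note also that the recipe's condition expects a quoted `boundary="..."` parameter, whereas the sample message in the question uses an unquoted one, so the condition may need adjusting for such messages.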

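For elegance points, you could develop the script to implement your 10-line body truncation at the same time, but at the very least, this should get you started. (I would switch to awk or Perl at this point.)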

:0fb
* ^Content-type: multipart/[a-z]+; boundary="\/[^"]+
| awk -v "b=--$MATCH" ' \
    ($0 == b || $0 == b "--") && seen++ { printf "%s--\n", $0; exit } \
    !seen || p++ < 10'

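Properly, the MIME part's headers should not count toward the line count.

This is slightly speculative; I assume that by "footer" you mean the ugly base64-encoded attachment after the first body part. And of course, this recipe does nothing at all for single-part messages; maybe you want to fall back to your original recipe for those.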
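
The awk filter above can be exercised outside procmail in the same spirit. This is only a sketch: `MATCH` is set by hand to the boundary from the question, and the sample body and function names are made up for illustration. Because the script counts boundary lines rather than relying on line 1, it behaves the same whether or not a preamble precedes the first boundary. It also shows the limitation mentioned above: the boundary line and the part's own headers use up some of the ten lines.

```shell
#!/bin/sh
# Stand-in for procmail's $MATCH; the value is the boundary from the question.
MATCH=20cf307f380654240604c0c37d07

# Same awk program as in the recipe, written for a plain shell
# (no procmail line-continuation backslashes).
truncate_body() {
    awk -v "b=--$MATCH" '
        ($0 == b || $0 == b "--") && seen++ { printf "%s--\n", $0; exit }
        !seen || p++ < 10'
}

# Made-up body: a text part with ten lines, followed by an HTML part.
sample_body() {
    cat <<EOF
--$MATCH
Content-Type: text/plain; charset=ISO-8859-1

line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
line 9
line 10
--$MATCH
Content-Type: text/html; charset=ISO-8859-1

<p>ignored</p>
--$MATCH--
EOF
}

# Prints the first ten lines starting at the first boundary (boundary line,
# part header, blank line, seven text lines), then a closing boundary.
sample_body | truncate_body
```

To keep ten lines of actual text, the counter would have to start only after the blank line that ends the part's headers.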

Answer 1 (score: 0)

I recently ran into a similar problem and solved it with this (adapted for the OP)...

#trying to take out input from the body
:0fb
| sed -n '/^Content-Type/,/^--/ { /^Content-Type/b; /^--/b; p }'

Explanation: the general form is...

sed -n '/begin/,/end/ { /begin/b; /end/b; p }'

-n          --> turn automatic printing off
/begin/     --> start of the pattern range (the commands in braces only apply inside the range)
,/end/      --> end of the pattern range
{ /begin/b; --> b (branch) makes lines matching /begin/ skip the remaining commands
/end/b;     --> same as above; these lines skip the upcoming p (print) command
p }'        --> prints the lines in the range that made it to this command
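
A quick way to see what this range expression does is to run it on a small multipart body. In the sketch below, the boundary comes from the question, while the sample body and the function names are assumptions made for illustration. Note that the range starts again at every `Content-Type` line, so the content of later parts is printed as well, and any other part headers (for example `Content-Transfer-Encoding`) pass through. The `first_part` variant, which quits at the first closing boundary, is not from the answer; it is one possible adjustment.

```shell
#!/bin/sh
# Boundary string copied from the question's sample message.
BOUNDARY=20cf307f380654240604c0c37d07

# The answer's filter, spread over several lines so that it does not
# depend on how a given sed treats ";" after "b" or a "}" right after "p".
all_parts() {
    sed -n '/^Content-Type/,/^--/ {
        /^Content-Type/b
        /^--/b
        p
    }'
}

# Hypothetical variant (not from the answer): quit at the first closing
# boundary so that later parts are never printed.
first_part() {
    sed -n '/^Content-Type/,/^--/ {
        /^Content-Type/b
        /^--/q
        p
    }'
}

# Made-up two-part body without a preamble.
sample_body() {
    cat <<EOF
--$BOUNDARY
Content-Type: text/plain; charset=ISO-8859-1

hi
there

--$BOUNDARY
Content-Type: text/html; charset=ISO-8859-1

<p>hi there</p>
--$BOUNDARY--
EOF
}

echo "== all parts =="
sample_body | all_parts
echo "== first part only =="
sample_body | first_part
```

Either function can be used as the command of a `:0fb` recipe in place of `head -10`.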