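Although RFC 2045 explicitly states that lines in quoted-printable (QP) encoding must not exceed 76 characters, not every client in the real world follows this requirement. Or am I misunderstanding what the RFC requires?
Consider the following lines from a real-world email: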
<style type=3D"text/css">=0Abody,td { color:#2f2f2f; font:11px/1.35em Verdana, Arial, Helvetica, sans-serif; }=0A</style>=0A<body style=3D"background:#F6F6F6; font-family:Verdana, Arial, Helvetica, sa=
ns-serif; font-size:12px; margin:0; padding:0;">=0D=0A<div style=3D"background:#F6F6F6; font-family:Verdana, Arial, Helvetica, sans-serif; font-size:12px; margin:0; padding:0;">=0D=0A<table cellspacin=
g=3D"0" cellpadding=3D"0" border=3D"0" width=3D"100%">=0D=0A<tr>=0D=0A <td align=3D"center" valign=3D"top" style=3D"padding:20px 0 20px 0">=0D=0A <!-- [ header starts here] -->=0D=0A =
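Each line is 201 characters plus CRLF. However, there are several =0A sequences, which decode to LF. Does this mean I need to be able to parse this message, or may I reject it?
It seems to me that it violates the following statement in the RFC, but I am not 100% sure: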
(5) (Soft Line Breaks) The Quoted-Printable encoding
REQUIRES that encoded lines be no more than 76
characters long. If longer lines are to be encoded
with the Quoted-Printable encoding, "soft" line breaks
must be used. An equal sign as the last character on a
encoded line indicates such a non-significant ("soft")
line break in the encoded text.
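The length check implied by rule (5) can be sketched in a few lines of Python. This is only an illustration: the function name `overlong_lines` and the constant `MAX_QP_LINE` are made up for this example, and it assumes the encoded body uses CRLF line endings. Following RFC 2045, the count excludes the trailing CRLF but includes a trailing equal sign.

```python
MAX_QP_LINE = 76  # limit from RFC 2045, rule (5); the name is hypothetical


def overlong_lines(encoded: bytes, limit: int = MAX_QP_LINE):
    """Return (line_number, length) for every encoded line longer than limit.

    Assumes CRLF-terminated lines. The length excludes the CRLF itself but
    counts every other character, including a trailing "=".
    """
    report = []
    for number, line in enumerate(encoded.split(b"\r\n"), start=1):
        if len(line) > limit:
            report.append((number, len(line)))
    return report


# Made-up sample: line 1 is within the limit, line 2 is 200 bytes plus "=".
sample = b"a" * 76 + b"\r\n" + b"b" * 200 + b"=\r\n" + b"c"
print(overlong_lines(sample))
```

A report like this lets a parser log the violation and still go on to decode the message rather than rejecting it outright.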
Answer 0 (score: 2)
You should be able to parse this message, even though its longest line contains 128 symbols.
This message contains both =0A and =SPACE sequences. =0A is a significant (hard) line break, while =SPACE is a soft line break.
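To make the distinction concrete, here is a minimal sketch of a lenient decoder in Python. It is not a full RFC 2045 implementation, and the name `decode_qp_lenient` is invented for this example. It assumes that a soft line break is an equal sign, optionally followed by spaces or tabs, and then a line ending; such breaks are dropped, while =XX escapes (including =0A and =0D) are turned into the bytes they stand for. It imposes no line-length limit and does not strip trailing whitespace before hard line breaks.

```python
import re

# "=" + optional spaces/tabs + line ending: a soft line break, removed entirely.
_SOFT_BREAK = re.compile(rb"=[ \t]*\r?\n")
# "=" + two hex digits: an escaped byte such as =0A (LF) or =3D ("=").
_HEX_ESCAPE = re.compile(rb"=([0-9A-Fa-f]{2})")


def decode_qp_lenient(data: bytes) -> bytes:
    """Decode quoted-printable data without enforcing any line-length limit."""
    # Step 1: join lines that were split by soft line breaks.
    joined = _SOFT_BREAK.sub(b"", data)
    # Step 2: replace each =XX escape with the byte it encodes.
    return _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), joined)


print(decode_qp_lenient(b'style=3D"sa=\r\nns-serif">=0D=0A<tr>=0A'))
```

Removing soft breaks before decoding escapes keeps the two kinds of line break separate: only the =0A and =0D=0A escapes survive as newlines in the output.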
A hard line break should be CRLF (=0D=0A), but the linked RFC 2045 also permits a bare LF (without CR):
(4) (Line Breaks) A line break in a text body, represented
as a CRLF sequence in the text canonical form, must be
represented by a (RFC 822) line break, which is also a
CRLF sequence, in the Quoted-Printable encoding. (...)
Note that many implementations may elect to encode the
local representation of various content types directly
rather than converting to canonical form first,
encoding, and then converting back to local
representation. In particular, this may apply to plain
text material on systems that use newline conventions
other than a CRLF terminator sequence. Such an
implementation optimization is permissible, but only
when the combined canonicalization-encoding step is
equivalent to performing the three steps separately.
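As a quick check that tolerant decoding is feasible in practice, Python's standard `quopri` module decodes input like this without complaining about line length. The sample bytes below are made up for illustration and are not the original message.

```python
import quopri

# One 201-character encoded line ending in a soft break, followed by a
# line that contains both a bare =0A and a full =0D=0A escape.
sample = b"x" * 200 + b"=\r\n" + b"y=0A<tr>=0D=0A"

decoded = quopri.decodestring(sample)
print(decoded)
```

The soft break disappears, the 200 `x` bytes are joined directly to `y`, and the escapes come out as LF and CRLF respectively.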