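We are facing an issue that could be fixed with a regular expression: https://github.com/php-mime-mail-parser/php-mime-mail-parser/issues/176
Some email addresses do not conform to the RFC822 standard.
The problem is caused by special characters (for example > and @) that are not inside double quotes (") and are not part of the email address itself.
Here are the input variants: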
Neuman@BBN-TENEXA
Alfred > Neuman <Neuman@BBN-TENEXA>
Alfred > Neuman <Neuman@BBN-TENEXA>, Alfred Neuman <Neuman@BBN-TENEXA>, "Alfred > Neuman" <Neuman@BBN-TENEXA>, Alfred > Neuman <Neuman@BBN-TENEXA>
"Alfred > Neuman" <Neuman@BBN-TENEXA>
Alfred @ Neuman <Neuman@BBN-TENEXA>
This is the required output:
Neuman@BBN-TENEXA
"Alfred > Neuman" <Neuman@BBN-TENEXA>
"Alfred > Neuman" <Neuman@BBN-TENEXA>, Alfred Neuman <Neuman@BBN-TENEXA>, "Alfred > Neuman" <Neuman@BBN-TENEXA>, "Alfred > Neuman" <Neuman@BBN-TENEXA>
"Alfred > Neuman" <Neuman@BBN-TENEXA>
"Alfred @ Neuman" <Neuman@BBN-TENEXA>
Can anyone help write a regular expression replacement that does this?
Answer 0 (score: 1)
Regex: ".*?"(*SKIP)(*FAIL)|(\w+\s[<>@]\s\w+)
Replacement: "$1"
Or, if you want to be more precise, use one of these:
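To show what the (*SKIP)(*FAIL) branch contributes, here is a minimal sketch that runs the pattern with and without it on one line from the question. The function name quoteDisplayNames and the variable names are illustrative assumptions, not part of php-mime-mail-parser or of the original answer.

```php
<?php
// Hypothetical helper, not part of php-mime-mail-parser.
function quoteDisplayNames(string $header): string
{
    // Branch 1 consumes an already quoted name and rejects it via (*SKIP)(*FAIL),
    // so the engine resumes after the closing quote.
    // Branch 2 captures an unquoted "word <|>|@ word" name.
    $pattern = '/".*?"(*SKIP)(*FAIL)|(\w+\s[<>@]\s\w+)/';
    // preg_replace returns null on error; fall back to the input in that case.
    return preg_replace($pattern, '"$1"', $header) ?? $header;
}

$line = '"Alfred > Neuman" <Neuman@BBN-TENEXA>, Alfred > Neuman <Neuman@BBN-TENEXA>';

// Without the skip branch, the already quoted name is wrapped a second time.
$naive = preg_replace('/(\w+\s[<>@]\s\w+)/', '"$1"', $line);
$guarded = quoteDisplayNames($line);

echo $naive, PHP_EOL;
echo $guarded, PHP_EOL;
echo quoteDisplayNames('Alfred @ Neuman <Neuman@BBN-TENEXA>'), PHP_EOL;
echo quoteDisplayNames('Neuman@BBN-TENEXA'), PHP_EOL;
```

Note that \w+ matches a single word on each side of the special character, and \s requires whitespace around it, so this sketch only covers display names shaped like the ones in the question.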
"\w+\s[<>@]\s\w+"(*SKIP)(*FAIL)|(\w+\s[<>@]\s\w+)
"Alfred\s[<>@]\sNeuman"(*SKIP)(*FAIL)|(Alfred\s[<>@]\sNeuman)
PHP code:
$text = 'Neuman@BBN-TENEXA
Alfred > Neuman <Neuman@BBN-TENEXA>
Alfred > Neuman <Neuman@BBN-TENEXA>, Alfred Neuman <Neuman@BBN-TENEXA>, "Alfred > Neuman" <Neuman@BBN-TENEXA>, Alfred > Neuman <Neuman@BBN-TENEXA>
"Alfred > Neuman" <Neuman@BBN-TENEXA>
Alfred @ Neuman <Neuman@BBN-TENEXA>';
// Skip anything already inside double quotes; wrap unquoted "word <|>|@ word" names in quotes.
$text = preg_replace('/".*?"(*SKIP)(*FAIL)|(\w+\s[<>@]\s\w+)/', '"$1"', $text);
print_r($text);
Output:
Neuman@BBN-TENEXA
"Alfred > Neuman" <Neuman@BBN-TENEXA>
"Alfred > Neuman" <Neuman@BBN-TENEXA>, Alfred Neuman <Neuman@BBN-TENEXA>, "Alfred > Neuman" <Neuman@BBN-TENEXA>, "Alfred > Neuman" <Neuman@BBN-TENEXA>
"Alfred > Neuman" <Neuman@BBN-TENEXA>
"Alfred @ Neuman" <Neuman@BBN-TENEXA>