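We have built a platform that lets users add special <#tags#> inside the attributes of HTML input elements. I use preg_replace_callback to find all matching inputs in the form body string, process them, and return the modified string for the whole form, including all updated input elements.
I have narrowed the problem down to the last attribute value starting with any run of letters followed by a colon. As far as I can tell, this is the only case that breaks the regex and makes it fail with "PREG_BACKTRACK_LIMIT_ERROR".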
<input onclick="javascript:blah();">
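breaks it. I have told the developers they should use onclick="blah()" instead, but the old form used to work and browsers support it, so they still expect it to work.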
<input onclick=":blah();">
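does not break it. This makes me think that some internal storage uses "key:value" pairs to hold references or something similar, and that the data being parsed is itself breaking that data format.
One really strange thing is that the code produces different results on Google App Engine PHP and on PHP 5.3.3 running on CentOS: native PHP raises the error in more cases.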
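One way to investigate the difference between the two environments is to print the PCRE-related settings on each of them and compare. The sketch below is only a diagnostic aid: the function name pcre_environment is hypothetical and not part of the original code. As far as I recall from the PHP manual, the default pcre.backtrack_limit was 100000 before PHP 5.3.7 and 1000000 afterwards, which might explain why PHP 5.3.3 reports the error in more cases, but I have not confirmed that for these two setups.

```php
<?php
// Hypothetical helper (not from the original post): collect the PCRE
// settings that commonly differ between PHP installations.
function pcre_environment() {
    return array(
        'php_version'     => PHP_VERSION,
        'pcre_version'    => PCRE_VERSION,
        'backtrack_limit' => ini_get('pcre.backtrack_limit'),
        'recursion_limit' => ini_get('pcre.recursion_limit'),
        // pcre.jit only exists on PHP 7.0+; ini_get() returns false otherwise.
        'jit'             => ini_get('pcre.jit'),
    );
}

// Print one "name: value" line per setting so two servers can be compared.
foreach (pcre_environment() as $name => $value) {
    echo $name, ': ', var_export($value, true), "\n";
}
```

Running this on both servers and comparing the output line by line should show whether the limits or the PCRE library version differ.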
Here is the test code and the test results:
<?php
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"javascript:getgoogledoc();\">");
process_string("<input type=\"button\" value=\"update google doc\" onclick=\":getgoogledoc();\">");
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"getgoogledoc();\">");
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"getgoogledoc();\" newattribute=\"javascript:test();\">");
process_string("<input type=\"button\" value=\"update google doc\" onclick=\"a:getgoogledoc();\">");
process_string("<input type=\"a:button\" value=\"javascript:update google doc\">");
process_string("<input type=\"button\" value=\"javascript:update google doc\" <# this makes it match #> onclick=\"javascript:getgoogledoc();\">");
process_string("<input type=\"button\" value=\"javascript:update google doc\" <# this makes it match #> onclick=\"getgoogledoc();\">");
function process_string($string) {
    echo "<p><b>NEW TEST</b><br />initial string:<br />";
    echo htmlspecialchars($string);

    // Match an <input> tag that contains a <# ... #> tag between two
    // (possibly empty) attribute lists. The callback returns the match unchanged.
    // On error preg_replace_callback() returns NULL, so $string ends up empty.
    $string = preg_replace_callback(
        '/<\s*input\s+((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*\s*)<#\s*(.*?)\s*#>((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*\s*)(\/\s*|)>/is',
        function($matches) {
            echo "<br />matched something...";
            return $matches[0];
        },
        $string
    );
    echo "<br />ok... ran the regex replace callback... string is now:<br />";
    echo htmlspecialchars($string);

    // Report the numeric error code, then the name of the matching constant.
    $last_error = preg_last_error();
    echo "<br />the last regex error was: $last_error";
    if ($last_error == PREG_NO_ERROR) {
        echo "<br />that is a PREG_NO_ERROR";
    }
    if ($last_error == PREG_INTERNAL_ERROR) {
        echo "<br />that is a PREG_INTERNAL_ERROR";
    }
    if ($last_error == PREG_BACKTRACK_LIMIT_ERROR) {
        echo "<br />that is a PREG_BACKTRACK_LIMIT_ERROR";
    }
    if ($last_error == PREG_RECURSION_LIMIT_ERROR) {
        echo "<br />that is a PREG_RECURSION_LIMIT_ERROR";
    }
    if ($last_error == PREG_BAD_UTF8_ERROR) {
        echo "<br />that is a PREG_BAD_UTF8_ERROR";
    }
    if ($last_error == PREG_BAD_UTF8_OFFSET_ERROR) {
        echo "<br />that is a PREG_BAD_UTF8_OFFSET_ERROR";
    }
}
?>
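As a side note, the chain of if statements that names the error code could be collapsed into a lookup table. This is only a sketch: the helper name preg_error_name is hypothetical and does not appear in the test script, and it covers only the constants the script already checks.

```php
<?php
// Hypothetical helper (not part of the original test script): translate the
// integer returned by preg_last_error() into the name of its constant.
function preg_error_name($code) {
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
        PREG_BAD_UTF8_OFFSET_ERROR => 'PREG_BAD_UTF8_OFFSET_ERROR',
    );
    // Fall back to a generic label for codes this table does not know about.
    return isset($names[$code]) ? $names[$code] : 'unknown error code ' . $code;
}

// A trivial match, used here only to have a fresh error state to report.
preg_match('/a/', 'abc');
echo 'the last regex error was: ', preg_error_name(preg_last_error()), "\n";
```

Inside process_string() the same call would replace the six if blocks with a single echo.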
Results:
NEW TEST
initial string:
<input type="button" value="update google doc" onclick="javascript:getgoogledoc();">
ok... ran the regex replace callback... string is now:
the last regex error was: 2
that is a PREG_BACKTRACK_LIMIT_ERROR
NEW TEST
initial string:
<input type="button" value="update google doc" onclick=":getgoogledoc();">
ok... ran the regex replace callback... string is now:
<input type="button" value="update google doc" onclick=":getgoogledoc();">
the last regex error was: 0
that is a PREG_NO_ERROR
NEW TEST
initial string:
<input type="button" value="update google doc" onclick="getgoogledoc();">
ok... ran the regex replace callback... string is now:
<input type="button" value="update google doc" onclick="getgoogledoc();">
the last regex error was: 0
that is a PREG_NO_ERROR
NEW TEST
initial string:
<input type="button" value="update google doc" onclick="getgoogledoc();" newattribute="javascript:test();">
ok... ran the regex replace callback... string is now:
the last regex error was: 2
that is a PREG_BACKTRACK_LIMIT_ERROR
NEW TEST
initial string:
<input type="button" value="update google doc" onclick="a:getgoogledoc();">
ok... ran the regex replace callback... string is now:
the last regex error was: 2
that is a PREG_BACKTRACK_LIMIT_ERROR
NEW TEST
initial string:
<input type="a:button" value="javascript:update google doc">
ok... ran the regex replace callback... string is now:
<input type="a:button" value="javascript:update google doc">
the last regex error was: 0
that is a PREG_NO_ERROR
NEW TEST
initial string:
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="javascript:getgoogledoc();">
matched something...
ok... ran the regex replace callback... string is now:
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="javascript:getgoogledoc();">
the last regex error was: 0
that is a PREG_NO_ERROR
NEW TEST
initial string:
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="getgoogledoc();">
matched something...
ok... ran the regex replace callback... string is now:
<input type="button" value="javascript:update google doc" <# this makes it match #> onclick="getgoogledoc();">
the last regex error was: 0
that is a PREG_NO_ERROR
Answer 0 (score: 2)
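PREG_BACKTRACK_LIMIT_ERROR occurs because of excessive backtracking, and it can be handled with possessive quantifiers.
Try this modification to your regex. Note the + quantifier I added right after the * that closes the first attribute group, so that ))*\s*)<# becomes ))*+\s*)<#:
'/<\s*input\s+((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*+\s*)<#\s*(.*?)\s*#>((\s*(\w+)\s*=\s*(\'(\\\\\\\\|\\\\\'|[^\'])*\'|"(\\\\\\\\|\\\\"|[^"])*"|(\w+))|\s*(\w+))*\s*)(\/\s*|)>/is'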
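To see the effect in isolation, here is a minimal sketch with a much smaller pattern. The two patterns and the subject string are simplified stand-ins chosen for illustration, not the regex from the question. Both patterns repeat a group whose body is itself a repetition, which is the shape that tends to cause heavy backtracking when the overall match cannot succeed.

```php
<?php
// Simplified stand-ins for the attribute list in the original pattern:
// a repeated group whose body is itself a repetition (\w+ inside (...)*).
$backtracking = '/^(?:\s*\w+)*;$/';   // ordinary greedy quantifier
$possessive   = '/^(?:\s*\w+)*+;$/';  // possessive: never gives characters back

// A long word followed by a character that prevents the match from completing.
$subject = str_repeat('a', 30) . ':';

foreach (array('backtracking' => $backtracking, 'possessive' => $possessive) as $label => $pattern) {
    $result = preg_match($pattern, $subject);
    echo $label, ': result=', var_export($result, true),
         ', preg_last_error()=', preg_last_error(), "\n";
}
```

With the possessive version the engine consumes the word once, fails at the colon, and stops, so preg_match() returns 0 with no error. The greedy version first has to try the many ways of splitting the word across iterations of the group; on a default configuration I would expect it to hit the backtrack limit instead, though the exact outcome depends on the PHP version, pcre.backtrack_limit, and whether PCRE JIT is enabled.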