Is it possible to do calculations with regex group matches?
The string:
(00) Bananas
...
(02) Apples (red ones)
...
(05) Oranges
...
(11) Some Other Fruit
...
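If the difference between the numbers at the start of two consecutive numbered lines is 3 or less, remove the "..." between them. So the string should come back like this: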
(00) Bananas
(02) Apples (red ones)
(05) Oranges
...
(11) Some Other Fruit
The regex:
$match = '/(*ANYCRLF)\((\d+)\) (.+)$
\.{3}
\((\d+)\) (.+)/m';
Now the tricky part is how to grab the matches and apply a condition such as
if ($3 - $1 <= 3) {
    //replace
}
Test: http://codepad.viper-7.com/f6iI4m
Thanks!
Answer 0 (score: 3)
You can do this with preg_replace_callback().
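$callback = function ($match) {
    // $match[2] is this line's number, $match[3] is the next numbered line's
    if ($match[3] <= $match[2] + 3) {
        return $match[1]; // keep only the numbered line, dropping the "..."
    } else {
        return $match[0]; // leave the matched text unchanged
    }
};
$newtxt = preg_replace_callback('/(^\((\d+)\).+$)\s+^\.{3}$(?=\s+^\((\d+)\))/m', $callback, $txt);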
Here's the pattern, piece by piece:
(^\((\d+)\).+$) # subpattern 1, first line; subpattern 2, the number
\s+^\.{3}$ # newline(s) and second line ("...")
(?=\s+^\((\d+)\)) # lookahead that matches another numbered line
# without consuming it; contains subpattern 3, next number
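So the match of the entire pattern is the first two lines (i.e., the numbered line and the '...' line).
If the difference between the numbers is greater than 3, the callback returns the original text in $match[0] (i.e., no change). If the difference is less than or equal to 3, it returns only the first line (found in $match[1]).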
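To try this end to end, here is a minimal sketch that wraps the same pattern and callback logic in a function and runs it on the sample input from the question. The function name remove_small_gaps and the $maxGap parameter are made up for illustration, and the explicit (int) casts are an addition that makes the numeric comparison obvious. It assumes LF line endings.

```php
<?php
// Hypothetical helper (name and $maxGap are illustrative, not from the answer):
// drops a "..." line when the numbered lines around it differ by at most $maxGap.
function remove_small_gaps(string $txt, int $maxGap = 3): ?string
{
    $pattern = '/(^\((\d+)\).+$)\s+^\.{3}$(?=\s+^\((\d+)\))/m';

    return preg_replace_callback($pattern, function (array $m) use ($maxGap): string {
        // $m[2] is the current line's number, $m[3] the next numbered line's.
        $gap = (int) $m[3] - (int) $m[2];
        // Small gap: keep only the numbered line. Otherwise keep the whole match.
        return $gap <= $maxGap ? $m[1] : $m[0];
    }, $txt);
}

$txt = implode("\n", [
    '(00) Bananas',
    '...',
    '(02) Apples (red ones)',
    '...',
    '(05) Oranges',
    '...',
    '(11) Some Other Fruit',
    '...',
]);

echo remove_small_gaps($txt), "\n";
```

The final "..." stays in place because no numbered line follows it, so the lookahead cannot match there.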
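Answer 1 (score: 0)
You can use preg_replace_callback and build the replacement string with any PHP code; the callback receives the captures. For your output, however, you would need overlapping matches for the replacement: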
(00) Bananas with (02) Apples -> 2-0=2, replace
(02) Apples with (05) Oranges -> 5-2=3, replace
But since the (02) Apples part of the input was already consumed by the previous match, it will not be picked up a second time.
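The effect can be seen by listing which number pairs each kind of pattern finds. The sketch below is only an illustration: both patterns are simplified stand-ins written for this comparison (not the exact patterns from the question or the answers), the $pairs helper is hypothetical, and LF line endings are assumed.

```php
<?php
$s = "(00) Bananas\n...\n(02) Apples (red ones)\n...\n(05) Oranges\n...\n(11) Some Other Fruit\n...";

// Consuming variant: the following numbered line is part of the match,
// so it cannot start the next match.
$consuming = '/^\((\d+)\) .+\n\.{3}\n\((\d+)\) .+$/m';

// Lookahead variant: the following numbered line is inspected but not consumed.
$lookahead = '/^\((\d+)\) .+\n\.{3}\n(?=\((\d+)\) )/m';

preg_match_all($consuming, $s, $a, PREG_SET_ORDER);
preg_match_all($lookahead, $s, $b, PREG_SET_ORDER);

// Hypothetical helper: turn each match into a "first->next" label.
$pairs = function (array $matches): array {
    return array_map(function (array $m): string {
        return $m[1] . '->' . $m[2];
    }, $matches);
};

print_r($pairs($a)); // 00->02, 05->11 (the 02->05 pair is skipped)
print_r($pairs($b)); // 00->02, 02->05, 05->11
```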
Here is a regex-based solution that uses a lookahead, credit to Wiseguy:
$s = "(00) Bananas
...
(02) Apples (red ones)
...
(05) Oranges
...
(11) Some Other Fruit
...";
$match = '/(*ANYCRLF)\((\d+)\) (.+)$
\.{3}
(?=\((\d+)\) (.+))/m';
// php5.3 anonymous function syntax
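$s = preg_replace_callback($match, function ($m) {
    if ($m[3] - $m[1] <= 3) {
        // drop the line break(s) and the "..." line, keep the numbered line
        return preg_replace('/[\r\n]+\.{3}/', '', $m[0]);
    }
    // gap too large: return the match unchanged
    return $m[0];
}, $s);
echo $s;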
This was my first take, based on the logic "find the dots, then look at the previous/next line":
$s = "(00) Bananas
...
(02) Apples (red ones)
...
(05) Oranges
...
(11) Some Other Fruit
...
(18) Some Other Fruit
...
(19) Some Other Fruit
...
";
$s = preg_replace("/\r\n?/", "\n", $s); // normalize CRLF/CR line endings to LF
$num_pattern = '/^\((?<num>\d+)\)/';
$dots_removed = 0;
preg_match_all('/\.{3}/', $s, $m, PREG_OFFSET_CAPTURE);
foreach ($m[0] as $dots) {
    $offset = $dots[1] - ($dots_removed * 4); // fix offset of changing input ("\n..." is 4 chars)
    $prev_line_end = $offset - 2; // -2 since the offset is pointing to the first '.', prev char is "\n"
    $prev_line_start = $prev_line_end; // start the search for the prev line's start from its end
    while ($prev_line_start > 0 && $s[$prev_line_start] != "\n") {
        --$prev_line_start;
    }
    $next_line_start = $offset + strlen($dots[0]) + 1;
    $next_line_end = strpos($s, "\n", $next_line_start);
    $next_line_end or $next_line_end = strlen($s);
    // +1 because $prev_line_end is the index of the line's last character
    $prev_line = trim(substr($s, $prev_line_start, $prev_line_end - $prev_line_start + 1));
    $next_line = trim(substr($s, $next_line_start, $next_line_end - $next_line_start));
    if (!$next_line) {
        break;
    }
    // get the numbers; skip this "..." if either neighbour is not a numbered line
    if (!preg_match($num_pattern, $prev_line, $prev) || !preg_match($num_pattern, $next_line, $next)) {
        continue;
    }
    if (intval($next['num']) - intval($prev['num']) <= 3) {
        // delete the "..." line together with the "\n" before it
        $s = substr_replace($s, '', $offset - 1, strlen($dots[0]) + 1);
        ++$dots_removed;
    }
}
print $s;
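The same "find the dots, then look at the previous/next line" logic can also be sketched without any offset bookkeeping by splitting the input into lines first. This is an alternative illustration, not the code above rewritten: the function name drop_close_dots and the $maxGap parameter are hypothetical, and the output always uses LF line endings.

```php
<?php
// Hypothetical helper: removes a "..." line when the numbered lines directly
// before and after it differ by at most $maxGap.
function drop_close_dots(string $s, int $maxGap = 3): string
{
    $lines = preg_split('/\R/', $s); // split on any line ending
    $out = [];
    foreach ($lines as $i => $line) {
        if ($line === '...'
            && isset($lines[$i - 1], $lines[$i + 1])
            && preg_match('/^\((\d+)\)/', $lines[$i - 1], $prev)
            && preg_match('/^\((\d+)\)/', $lines[$i + 1], $next)
            && (int) $next[1] - (int) $prev[1] <= $maxGap
        ) {
            continue; // neighbours are close enough: skip this "..." line
        }
        $out[] = $line;
    }
    return implode("\n", $out);
}

$sample = "(00) Bananas\n...\n(02) Apples (red ones)\n...\n(05) Oranges\n...\n"
        . "(11) Some Other Fruit\n...\n(18) Some Other Fruit\n...\n(19) Some Other Fruit\n...\n";

print drop_close_dots($sample);
```

Because the neighbours are always read from the original list of lines, removing one "..." never shifts the positions used for the next check.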