Question

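I have an HTML file that contains some data, including some URLs.

Only within those URLs, I want to replace the _ character with a space (from a PHP script).

So a URL like this: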

</p><p><a rel="nofollow" class="external text" href="http://10.20.0.30:1234/index.php/this_is_an_example.html">How_to_sample.</a>

would become

</p><p><a rel="nofollow" class="external text" href="http://10.20.0.30:1234/index.php/this is an example.html">How_to_sample.</a>

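without affecting any _ that is not part of a URL.

I think this can be done with preg_replace, but I don't know how to go about it.

The following code is not correct, because it replaces every _, not only the ones inside the URL.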

$content2 = preg_replace('/[_]/', ' ', $content);

Thanks.

Edit:

Thanks for the preg_replace_callback suggestion; that is what I was looking for.

    // search pattern: literal dots are escaped, and [^"]*? keeps the
    // capture inside the href attribute
    $pattern = '/href="http:\/\/10\.20\.0\.30:1234\/index\.php\/([^"]*?)\.html">/';

    // the function call
    $content2 = preg_replace_callback($pattern, 'callback', $content);

    // the callback function: $m[1] is the part of the URL between
    // "index.php/" and ".html"
    function callback ($m) {
        $url = str_replace("_", " ", $m[1]);
        return 'href="http://10.20.0.30:1234/index.php/'.$url.'.html">';
    }

Answer 1

If you are comfortable with a bit of regex trickery, you can do the job with preg_replace() alone.

Code:

$input = '</p><p><a rel="nofollow" class="external text" href="http://10.20.0.30:1234/index.php/this_is_an_example.html">How_to_sample.</a>';

$pattern = '~(?:\G|\Qhttp://10.20.0.30:1234/index.php\E[^_]+)\K_([^_.]*)~';

echo preg_replace($pattern, " $1", $input);

Output:

</p><p><a rel="nofollow" class="external text" href="http://10.20.0.30:1234/index.php/this is an example.html">How_to_sample.</a>

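\G is the "continue" metacharacter. It only matches at the position where the previous match ended, which lets you make several consecutive matches after the expected part of the URL.

\Q..\E says: "treat every character between the two markers literally", so nothing in between needs escaping.

\K means "restart the full-string match from this point": everything matched before it is required but is not part of the reported match, so it is not replaced.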
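To see each of the three pieces in isolation, here is a minimal sketch on toy strings. The subject strings and variable names are made up for illustration and are not part of the original answer.

```php
<?php
// \G: each match must begin exactly where the previous one ended,
// so the chain of replacements stops at the first character that does not match.
$chained = preg_replace('~\Ga~', 'X', 'aaabaa');
echo $chained, PHP_EOL; // XXXbaa

// \K: "foo" must be present, but only "bar" counts as the match and is replaced.
$kept = preg_replace('~foo\Kbar~', 'X', 'foobar foo bar');
echo $kept, PHP_EOL; // fooX foo bar

// \Q..\E: the dot is literal inside the quoted span and a wildcard outside it.
$literal  = preg_match('~^\Q10.20\E$~', '10x20'); // 0 (no match)
$wildcard = preg_match('~^10.20$~', '10x20');     // 1 (match)
echo $literal, ' ', $wildcard, PHP_EOL;
```

In the answer's pattern the same three pieces work together: \Q..\E pins the literal URL prefix, \K drops that prefix from the match, and \G lets each further underscore match only if it directly continues the previous match.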


Since you are building a URL, I think you should replace with %20 rather than a literal space.
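As a sketch of that suggestion, only the replacement string changes; the input and the pattern are the ones from the code above, and the variable name $encoded is just illustrative.

```php
<?php
$input = '</p><p><a rel="nofollow" class="external text" href="http://10.20.0.30:1234/index.php/this_is_an_example.html">How_to_sample.</a>';

$pattern = '~(?:\G|\Qhttp://10.20.0.30:1234/index.php\E[^_]+)\K_([^_.]*)~';

// Single quotes keep $1 as a regex back-reference; %20 is inserted literally.
$encoded = preg_replace($pattern, '%20$1', $input);

echo $encoded;
// </p><p><a rel="nofollow" class="external text" href="http://10.20.0.30:1234/index.php/this%20is%20an%20example.html">How_to_sample.</a>
```

The link text (How_to_sample.) is left alone, exactly as with the space replacement.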

I suppose that, as a best practice, my pattern should also stop \G from matching at the start of the string:

$pattern = '~(?:\G(?!^)|\Qhttp://10.20.0.30:1234/index.php\E[^_]+)\K_([^_.]*)~';
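The difference only shows up when the subject itself begins with an underscore, because on the very first attempt \G also matches at offset 0. A small sketch with a made-up subject string (not taken from the question) that contains no URL at all:

```php
<?php
// Hypothetical subject: starts with "_" and contains no URL.
$subject = '_leading_underscores are not part of any URL';

$loose  = '~(?:\G|\Qhttp://10.20.0.30:1234/index.php\E[^_]+)\K_([^_.]*)~';
$strict = '~(?:\G(?!^)|\Qhttp://10.20.0.30:1234/index.php\E[^_]+)\K_([^_.]*)~';

// Without (?!^), \G succeeds at the start of the string and the chain begins there.
$looseResult = preg_replace($loose, ' $1', $subject);

// With (?!^), \G can only continue a match that the URL prefix has already started.
$strictResult = preg_replace($strict, ' $1', $subject);

var_dump($looseResult);  // string(44) " leading underscores are not part of any URL"
var_dump($strictResult); // string(44) "_leading_underscores are not part of any URL"
```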
