Question

I need to find a way to replace every <p> inside every <blockquote> that appears before the <hr />.

Here is a sample of the HTML:

<p>2012/01/03</p>
<blockquote>
    <h4>File name</h4>
    <p>Good Game</p>
</blockquote>
<blockquote><p>Laurie Ipsumam</p></blockquote>
<h4>Some title</h4>
<hr />
<p>Lorem Ipsum</p>
<blockquote><p>Laurel Ipsucandescent</p></blockquote>

Here is what I have so far:

    $pieces = explode("<hr", $theHTML, 2);
    $blocks = preg_match_all('/<blockquote>(.*?)<\/blockquote>/s', $pieces[0], $blockmatch); 

    if ($blocks) { 
        $t1=$blockmatch[1];
        for ($j=0;$j<$blocks;$j++) {
            $paragraphs = preg_match_all('/<p>/', $t1[$j], $paragraphmatch);
            if ($paragraphs) {
                $t2=$paragraphmatch[0]; 
                for ($k=0;$k<$paragraphs;$k++) { 
                    $t1[$j]=str_replace($t2[$k],'<p class="whatever">',$t1[$j]);
                }
            }
        } 
    }

I think I'm very close, but I don't know how to put the HTML that I just split apart and modified back together.

Answer 1

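You could try SimpleXML or, better, DOMDocument (http://www.php.net/manual/en/class.domdocument.php). Make sure the markup is valid HTML first, then use it to find the nodes you are looking for and replace them. For the lookup you could try XPath (http://w3schools.com/xpath/xpath_syntax.asp).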
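
The DOMDocument and XPath approach described above could be sketched roughly as follows. This is only an illustration, not code from the answer: the function name addClassBeforeHr, the wrapper element with id "wrap", and the XPath expression are assumptions, and the sketch assumes ASCII input (other encodings need extra handling when loading).

```php
<?php
// Hypothetical helper; the name and the wrapper <div> are assumptions.
function addClassBeforeHr(string $html, string $class): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    libxml_use_internal_errors(true);
    // Wrap the fragment in a known container so it can be extracted again.
    $doc->loadHTML('<div id="wrap">' . $html . '</div>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Every <p> inside a <blockquote> that has no <hr> before it,
    // i.e. blockquotes located before the first <hr>.
    $nodes = $xpath->query('//blockquote[not(preceding::hr)]//p');
    foreach ($nodes as $p) {
        $p->setAttribute('class', $class);
    }

    // Serialize only the children of the wrapper, not the <html>/<body>
    // scaffolding that loadHTML() adds.
    $wrap = $xpath->query('//div[@id="wrap"]')->item(0);
    $out = '';
    foreach ($wrap->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Calling addClassBeforeHr($theHTML, 'whatever') returns the modified markup as a string. Because DOMDocument re-serializes the document, whitespace may differ from the input, and a self-closing <hr /> will likely come back as <hr>.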

Edit 1:

Take a look at the answers to this question:

RegEx match open tags except XHTML self-contained tags

Answer 2

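// Split at the first "<hr" only, so everything after it stays in one piece.
$string = explode('<hr', $string, 2);
// The U modifier makes the quantifiers lazy; s lets . match newlines.
$string[0] = preg_replace('/<blockquote>(.*)<p>(.*)<\/p>(.*)<\/blockquote>/sU', '<blockquote>\1<p class="whatever">\2</p>\3</blockquote>', $string[0]);
$string = $string[0] . '<hr' . $string[1];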

Output:

<p>2012/01/03</p>
<blockquote>
    <h4>File name</h4>
    <p class="whatever">Good Game</p>
</blockquote>
<blockquote><p class="whatever">Laurie Ipsumam</p></blockquote>
<h4>Some title</h4>
<hr />
<p>Lorem Ipsum</p>
<blockquote><p>Laurel Ipsucandescent</p></blockquote>
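
Because the quantifiers in the pattern above are lazy, it rewrites only the first <p> in each blockquote. If a blockquote can hold several paragraphs, one possible variation, closer to the preg_match_all + str_replace attempt in the question, is to use preg_replace_callback and rejoin the pieces with implode. This is a sketch under assumptions: the function name addClassWithCallback is made up, the tags are assumed to be written exactly as <blockquote> and <p> with no attributes, and $class is assumed to be a safe literal.

```php
<?php
// Hypothetical helper; not part of the original answers.
function addClassWithCallback(string $html, string $class): string
{
    // Split at the first "<hr" only; the part after it is left untouched.
    $parts = explode('<hr', $html, 2);

    $parts[0] = preg_replace_callback(
        '/<blockquote>(.*?)<\/blockquote>/s',
        function (array $m) use ($class): string {
            // Rewrite every bare <p> inside this one blockquote.
            $inner = str_replace('<p>', '<p class="' . $class . '">', $m[1]);
            return '<blockquote>' . $inner . '</blockquote>';
        },
        $parts[0]
    );

    // implode() puts the "<hr" delimiter back between the two pieces and
    // simply returns the single piece when the input had no <hr at all.
    return implode('<hr', $parts);
}
```

Since the callback returns the whole blockquote with its contents already modified, there is no separate reassembly step: preg_replace_callback substitutes each match in place, and implode restores the part after the <hr />.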

PHP preg_match_all + str_replace
