Regex to count non-nested <p> </p> tags

Time: 2013-04-13 20:35:51

Tags: php regex

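The function below does exactly what it says: it inserts an HTML string into the content after the second paragraph tag it finds.

I need to change it slightly so that it only counts paragraph tags that are not inside other tags. In other words, only top-level paragraph tags.

Is there any way to do this with a regex?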

function my_html_insert($content){
    // Insert after this many paragraphs; with fewer paragraphs, append at the end.
    $InsertAfterParagraph = 2;

    if (substr_count(strtolower($content), '</p>') < $InsertAfterParagraph) {
        return $content . myFunction($my_insert = 1);
    }

    // Matches every <p>...</p> that is followed by a newline, nested or not.
    return preg_replace_callback('#(<p[\s>].*?</p>\n)#s', 'my_p_callback', $content);
}


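function my_p_callback($matches)
{
    // Counts matched paragraphs; static, so it persists across calls.
    static $count = 0;
    $ret = $matches[1];
    $pCount = get_option('my_p_count');

    // Append the insert once the configured paragraph number is reached.
    if (++$count == $pCount) {
        $ret .= myFunction($my_insert = 1);
    }

    return $ret;
}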
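
To make the problem concrete, here is a minimal sketch that runs the same pattern over some made-up sample markup (the `$sample` string is an assumption for illustration, not real content). It shows that the pattern also matches a paragraph nested inside a `<div>`, so the counter in my_p_callback() would advance on it too.

```php
<?php
// Hypothetical sample: two top-level paragraphs and one nested inside a <div>.
$sample = "<p>One</p>\n<div><p>Nested</p>\n</div>\n<p>Two</p>\n";

// Same pattern as in my_html_insert().
$matchCount = preg_match_all('#(<p[\s>].*?</p>\n)#s', $sample, $matches);

// The nested paragraph is matched like any other, so it is counted as well.
echo $matchCount, "\n";
print_r($matches[1]);
```

The pattern has no notion of nesting depth, which is why all three paragraphs are counted instead of only the two top-level ones.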

1 answer:

Answer 0 (score: 3)

I would still parse it, since that is cleaner and easier to maintain:

<?php

$doc = new DOMDocument();
$doc->loadHTML("
    <!DOCTYPE html>
    <html>
        <body>
            <p>Test 1</p>
            <div>Test <p>2</p></div>
            <p>Test <span>3</span></p>
        </body>
    </html>
");
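$xpath = new DOMXPath($doc);

// Select only <p> elements that are direct children of <body>,
// i.e. top-level paragraphs; the <p> inside the <div> is skipped.
$elements = $xpath->query("/html/body/p");

foreach ($elements as $element) {
    // Build the markup to insert after this paragraph.
    $node = $doc->createDocumentFragment();
    $node->appendXML('<h1>This is a test</h1>');

    if ($element->nextSibling) {
        $element->parentNode->insertBefore($node, $element->nextSibling);
    } else {
        // Last child of its parent: append at the end instead.
        $element->parentNode->appendChild($node);
    }
}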

echo $doc->saveHTML();

?>

Output:

<!DOCTYPE html>
<html>
    <body>
        <p>Test 1</p><h1>This is a test</h1>
        <div>Test <p>2</p></div>
        <p>Test <span>3</span></p><h1>This is a test</h1>
    </body>
</html>
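
The code above inserts after every top-level paragraph. To get closer to what the question asks (insert once, after the N-th top-level paragraph), the same approach could be adapted roughly as below. This is only a sketch under some assumptions: the function name `insert_after_nth_top_level_p` and the wrapper id `top-level-wrapper` are made up for illustration, the content is an HTML fragment rather than a full document, and the inserted string must be well-formed XML because it goes through appendXML(). If there are fewer than N top-level paragraphs, the insert is appended at the end, mirroring the question's function.

```php
<?php
// Hypothetical helper (not from the original post): inserts $insertHtml
// after the $n-th top-level <p> of an HTML fragment.
function insert_after_nth_top_level_p(string $content, string $insertHtml, int $n): string
{
    $doc = new DOMDocument();

    // Wrap the fragment so that "top level" has a well-defined parent element.
    // Parser warnings for imperfect markup are collected instead of printed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div id="top-level-wrapper">' . $content . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $wrapper = $xpath->query('/html/body/div[@id="top-level-wrapper"]')->item(0);

    // Relative query: only <p> elements that are direct children of the wrapper.
    $paragraphs = $xpath->query('p', $wrapper);

    $fragment = $doc->createDocumentFragment();
    $fragment->appendXML($insertHtml);

    if ($paragraphs->length < $n) {
        // Not enough top-level paragraphs: append at the end.
        $wrapper->appendChild($fragment);
    } else {
        $target = $paragraphs->item($n - 1);
        // insertBefore() with a null reference node appends, so a last-child
        // paragraph needs no separate branch.
        $target->parentNode->insertBefore($fragment, $target->nextSibling);
    }

    // Serialize only the wrapper's children, dropping the wrapper itself.
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage: the nested paragraph is skipped, so the insert lands after "Two".
$result = insert_after_nth_top_level_p(
    '<p>One</p><div><p>Nested</p></div><p>Two</p><p>Three</p>',
    '<h1>Inserted</h1>',
    2
);
echo $result, "\n";
```

Whitespace in the serialized output can vary with the libxml version, so it is safer to compare results with whitespace normalized. Content with non-ASCII characters would also need explicit encoding handling before loadHTML(), which this sketch leaves out.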