PHP: Manipulating HTML from a string

Time: 2011-09-15 15:41:35

Tags: php html

I am reading an HTML string from a text editor and need to manipulate it before saving it to the database.

What I have looks like one of these:

<h3>Some Text<img src="somelink.jpg" /></h3>

<h3><img src="somelink.jpg" />Some Text</h3>

I need to convert it to the following format:

<h3>Some Text</h3><div class="img_wrapper"><img src="somelink.jpg" /></div>

Here is the solution I came up with.

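$html = '<html><body>' . $field["data"][0] . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Snapshot the live node list so that moving nodes does not disturb the iteration
$domNodeList = iterator_to_array($dom->getElementsByTagName("img"));

// Move each img out of its h3 and insert it, wrapped in a div, right after the h3
foreach ($domNodeList as $domNode) {
    if ($domNode->parentNode->nodeName == "h3") {
        $parentNode = $domNode->parentNode;
        $parentParentNode = $parentNode->parentNode;

        // Build <div class="img_wrapper"> and move the img into it
        $wrapper = $dom->createElement("div");
        $wrapper->setAttribute("class", "img_wrapper");
        $wrapper->appendChild($domNode);

        $parentParentNode->insertBefore($wrapper, $parentNode->nextSibling);
    }
}

echo $dom->saveHTML();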

2 Answers:

Answer 0 (score: 1)

You may be looking for preg_replace:

// take a search pattern, wrap the image tag matching parts in a tag
// and put the start and ending parts before the wrapped image tag.
// note: this will not match tags that contain > characters within them,
//       and will only handle a single image tag
$output = preg_replace(
    '|(<h3>[^<]*)(<img [^>]+>)([^<]*</h3>)|',
    '$1$3<div class="img_wrapper">$2</div>',
    $input
);
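
As a quick check, the pattern above can be run against the two inputs from the question. This is only a sketch: the function name wrapHeadingImage is made up for illustration and is not part of the original answer.

```php
<?php
// Hypothetical wrapper around the preg_replace call, for illustration only
function wrapHeadingImage($input)
{
    return preg_replace(
        '|(<h3>[^<]*)(<img [^>]+>)([^<]*</h3>)|',
        '$1$3<div class="img_wrapper">$2</div>',
        $input
    );
}

// The two input shapes shown in the question
$samples = array(
    '<h3>Some Text<img src="somelink.jpg" /></h3>',
    '<h3><img src="somelink.jpg" />Some Text</h3>',
);

foreach ($samples as $sample) {
    echo wrapHeadingImage($sample), "\n";
}
// Both lines print:
// <h3>Some Text</h3><div class="img_wrapper"><img src="somelink.jpg" /></div>
```

Because the regex works on the raw string, the img tag is copied through unchanged, including its trailing slash.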

Answer 1 (score: 0)

I updated the question with the answer, but for good measure, here is the answer again.

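$html = '<html><body>' . $field["data"][0] . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

// Snapshot the live node list so that moving nodes does not disturb the iteration
$domNodeList = iterator_to_array($dom->getElementsByTagName("img"));

// Move each img out of its h3 and insert it, wrapped in a div, right after the h3
foreach ($domNodeList as $domNode) {
    if ($domNode->parentNode->nodeName == "h3") {
        $parentNode = $domNode->parentNode;
        $parentParentNode = $parentNode->parentNode;

        // Build <div class="img_wrapper"> and move the img into it
        $wrapper = $dom->createElement("div");
        $wrapper->setAttribute("class", "img_wrapper");
        $wrapper->appendChild($domNode);

        $parentParentNode->insertBefore($wrapper, $parentNode->nextSibling);
    }
}

echo $dom->saveHTML();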
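
Since the goal is to save the result to the database, it may help to get back only the fragment rather than the whole document that saveHTML() prints when called without arguments. The following is a sketch under a few assumptions: the function name moveHeadingImages is hypothetical, the libxml error suppression is an addition not present in the original, and passing a node to saveHTML() requires PHP 5.3.6 or later.

```php
<?php
// Hypothetical helper: moves every img that sits directly inside an h3 into a
// <div class="img_wrapper"> placed right after that h3, and returns the fragment.
function moveHeadingImages($fragment)
{
    $dom = new DOMDocument();

    // Assumption: collect parser warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML('<html><body>' . $fragment . '</body></html>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Iterate over a snapshot, because getElementsByTagName returns a live list
    foreach (iterator_to_array($dom->getElementsByTagName('img')) as $img) {
        $heading = $img->parentNode;
        if ($heading->nodeName !== 'h3') {
            continue;
        }
        $wrapper = $dom->createElement('div');
        $wrapper->setAttribute('class', 'img_wrapper');
        $wrapper->appendChild($img); // appendChild moves the node out of the h3
        $heading->parentNode->insertBefore($wrapper, $heading->nextSibling);
    }

    // Serialize only the children of <body>, not the surrounding document
    $body = $dom->getElementsByTagName('body')->item(0);
    $out = '';
    foreach ($body->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

echo moveHeadingImages('<h3><img src="somelink.jpg" />Some Text</h3>');
```

Unlike the regex approach, the output goes through the HTML serializer, so the img tag is expected to come back without its trailing slash, and surrounding whitespace may differ slightly from the input.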