
Time: 2011-07-06 10:26:54

Tags: php xpath domdocument




<p>Intro Text</p>
<ul>
   <li>List point 1</li>
   <li>List point 2</li>
</ul>
<p>Some text before an image. 
   <img alt="Slide 1" src="/files/slide1.png" /> 
   Maybe some text in between, nobody knows what the scientists are up to. 
   <img alt="Slide 2" src="/files/slide2.png" /> 
   And even more text right after that.
</p>

What I want to do is pull the images out of the <p> and add them, wrapped in floating <div>s, before the respective paragraph:

<p>Intro Text</p>
<ul>
   <li>List point 1</li>
   <li>List point 2</li>
</ul>
<div class="custom">
   <a href="/files/fullview/slide1.png" rel="lightbox[group][Slide 1]">
      <img src="/files/thumbs/files/slide1.png" />
   </a>
</div>
<div class="custom">
   <a href="/files/fullview/slide2.png" rel="lightbox[group][Slide 2]">
      <img src="/files/thumbs/files/slide2.png" />
   </a>
</div>
<p>Some text before an image. 
   Maybe some text in between, nobody knows what the scientists are up to. 
   And even more text right after that.
</p>

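So what I need to do is get all image nodes from the HTML the editor generates, process them, insert the divs, and remove the image nodes. After reading a lot of similar questions I am still missing something and cannot get it to work. Probably I still misunderstand the whole concept behind DOM manipulation. Here is what I have come up with so far: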

// create DOMDocument
$doc = new DOMDocument();
// load WYSIWYG html into DOMDocument
$doc->loadHTML($html_from_editor);
// create DOMXpath
$xpath = new DOMXpath($doc);
// create list of all first level DOMNodes (these are p's or ul's in most cases)
$children = $xpath->query("/");
foreach ( $children AS $child ) {
    // now get all images
    $cpath = new DOMXpath($child);
    $images = $cpath->query('//img');
    foreach ( $images AS $img ) {
        // get attributes
        $atts = $img->attributes;
        // create replacement
        $lb_div = $doc->createElement('div');
        $lb_a = $doc->createElement('a');
        $lb_img = $doc->createElement('img');
        $lb_img->setAttribute("src", '/files/thumbs'.$atts->src);
        $lb_a->setAttribute("href", '/files/fullview'.$atts->src);
        $lb_a->setAttribute("rel", "lightbox[slide][".$atts->alt."]");
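        $lb_div->setAttribute("class", "custom");
        // assemble and insert the replacement
        $lb_a->appendChild($lb_img);
        $lb_div->appendChild($lb_a);
        $child->parentNode->insertBefore($lb_div, $child);
        // remove original node
        $child->removeChild($img);
    }
}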

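Problems:

  1. `$atts` is not populated with values. It does contain the correct attribute names, but the values are missing.
  2. If I understand it correctly, `insertBefore` should be called on the parent node of the child, so it should be `$child->parentNode->insertBefore($lb_div, $child);`, but the parent node is not defined.
  3. Removing the original img tag does not work.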


Thanks in advance, Paul

2 answers:

Answer 0 (score: 2)


$dom = new DOMDocument;
$dom->preserveWhiteSpace = false;
$dom->loadXML("<div>$xhtml</div>"); // we need the div as root element

// find all img elements in paragraphs in the partial body
$xp = new DOMXPath($dom);
foreach ($xp->query('/div/p/img') as $img) {

    $parentNode = $img->parentNode; // store for later
    $parentNode->removeChild($img); // unlink all found img elements

    // create the <a> element
    $a = $dom->createElement('a');
    $a->setAttribute('href', '/files/fullview/' . basename($img->getAttribute('src')));
    $a->setAttribute('rel', sprintf('lightbox[group][%s]', $img->getAttribute('alt')));

    // prepend img src with path to thumbs and remove alt attribute
    $img->setAttribute('src', '/files/thumbs' . $img->getAttribute('src'));
    $img->removeAttribute('alt'); // imo you should keep it for accessibility though

    // create the holding div
    $div = $dom->createElement('div');
    $div->setAttribute('class', 'custom');

    // insert the holding div
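    $parentNode->parentNode->insertBefore($div, $parentNode);

    // nest the link in the div and the image in the link
    $div->appendChild($a);
    $a->appendChild($img);
}

$dom->formatOutput = true;
echo $dom->saveXml($dom->documentElement);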
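
Note that `loadXML()` only accepts well-formed markup, so editor output with an unclosed tag would make the approach above fail. A minimal sketch of detecting that case, assuming two hypothetical sample strings (`$good` and `$bad`) that are not part of the original question:

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical samples: the second one has an unclosed <img>, which is
// valid HTML but not well-formed XML.
$good = '<p>Text <img alt="Slide 1" src="/files/slide1.png" /></p>';
$bad  = '<p>Text <img alt="Slide 1" src="/files/slide1.png"></p>';

$domGood = new DOMDocument();
$goodLoaded = $domGood->loadXML("<div>$good</div>"); // true on success

$domBad = new DOMDocument();
$badLoaded = $domBad->loadXML("<div>$bad</div>");    // false on parse failure

// Inspect what went wrong, then reset the error buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

if (!$badLoaded) {
    echo "Markup is not well-formed: ", trim($errors[0]->message), "\n";
}
```

If the editor cannot guarantee XHTML output, `loadHTML()` is the more forgiving choice, at the cost of the extra `html` and `body` wrapper elements.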

Answer 1 (score: 1)


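  1. You are iterating over the document root element. That is just one element, so it picks up all the images inside it.
  2. The second XPath must be relative to the child, so it has to start with `.`.
  3. If you load a chunk of HTML, DOMDocument creates the missing elements around it, such as `body`. You need to account for that in your XPath queries and in the output.
  4. The way you access the attributes is wrong. With error reporting enabled, this would have given you error messages.

Just take a look at the working code I was able to assemble. I have left some notes: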

    $html_from_editor = <<<EOD
    <p>Intro Text</p>
    <ul>
       <li>List point 1</li>
       <li>List point 2</li>
    </ul>
    <p>Some text before an image. 
       <img alt="Slide 1" src="/files/slide1.png" /> 
       Maybe some text in between, nobody knows what the scientists are up to. 
       <img alt="Slide 2" src="/files/slide2.png" /> 
       And even more text right after that.
    </p>
    EOD;

    // create DOMDocument
    $doc = new DOMDocument();
    // load WYSIWYG html into DOMDocument
    $doc->loadHTML($html_from_editor);
    // create DOMXpath
    $xpath = new DOMXpath($doc);
    // create list of all first level DOMNodes (these are p's or ul's in most cases)
    # NOTE: this is XHTML now
    $children = $xpath->query("/html/body/p");
    foreach ( $children AS $child ) {
        // now get all images
        $cpath = new DOMXpath($doc);
        $images = $cpath->query('.//img', $child); # NOTE relative to $child, mind the .
        // if no images are found, continue
        if (!$images->length) continue;
        // insert replacement node
        $lb_div = $doc->createElement('div');
        $lb_div->setAttribute("class", "custom");
        $lb_div = $child->parentNode->insertBefore($lb_div, $child);
        foreach ( $images AS $img ) {
            // get attributes
            $atts = $img->attributes;
            $atts = (object) iterator_to_array($atts); // make $atts more accessible    
            // create the new link with lightbox and full view
            $lb_a = $doc->createElement('a');
            $lb_a->setAttribute("href", '/files/fullview'.$atts->src->value);
            $lb_a->setAttribute("rel", "lightbox[slide][".$atts->alt->value."]");
            // create the new image tag for thumbnail
            $lb_img = $img->cloneNode(); # NOTE clone instead of creating new
            $lb_img->setAttribute("src", '/files/thumbs'.$atts->src->value);
            // bring the new nodes together and insert them
            $lb_a->appendChild($lb_img);
            $lb_div->appendChild($lb_a);
            // remove the original image
            $img->parentNode->removeChild($img);
        }
    }

    // get body content (original content)
    $result = '';
    foreach ($xpath->query("/html/body/*") as $child) {
        $result .= $doc->saveXML($child); # NOTE or saveHtml 
    }
    echo $result;
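
Points 2 and 4 can also be looked at in isolation. The following is a small sketch, not part of the original answer, using a hypothetical two-paragraph sample. It passes each paragraph as the context node of a relative query and reads attribute values with `getAttribute()`, with the attribute map shown as an alternative:

```php
<?php
// Hypothetical sample markup: one paragraph with an image, one without.
$doc = new DOMDocument();
$doc->loadHTML('<p>One <img alt="Slide 1" src="/files/slide1.png" /></p><p>Two</p>');

$xpath   = new DOMXPath($doc);
$counts  = []; // number of images found per paragraph
$sources = []; // src values read via getAttribute()
$alts    = []; // alt values read via the attribute map

// loadHTML() wraps the fragment in html/body, hence the path.
foreach ($xpath->query('/html/body/p') as $p) {
    // The leading dot makes the query relative to $p, the context node.
    $images = $xpath->query('.//img', $p);
    $counts[] = $images->length;

    foreach ($images as $img) {
        // getAttribute() returns the attribute value as a string.
        $sources[] = $img->getAttribute('src');
        // The attribute map holds DOMAttr nodes; the value is in nodeValue.
        $alts[] = $img->attributes->getNamedItem('alt')->nodeValue;
    }
}

print_r($counts);
print_r($sources);
```

Without the leading dot, `//img` would search the whole document on every iteration, regardless of the context node.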