Replace all images in HTML with text

Time: 2013-05-19 19:19:58

Tags: php domdocument domxpath

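I am trying to replace all images in some HTML that meet specific requirements with the appropriate text. The requirements are that the image has the class "replaceMe" and that its src filename is a key in $myArray. While searching for solutions, it seemed that some sort of PHP DOM technique is appropriate, but I am very new to this. For example, given $html, I wish to get back $desiredHtml. At the bottom of this post is my attempted implementation, which currently doesn't work. Thank you.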

$myArray=array(
    'goodImgage1'=>'Replacement for Good Image 1',
    'goodImgage2'=>'Replacement for Good Image 2'
);

$html = '<div>
<p>Random text and an <img src="goodImgage1.png" alt="" class="replaceMe">.  More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="replaceMe">.  More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="dontReplaceMe">.  More random text.</p>
<p>Random text and an <img src="badImgage1.png"  alt="" class="replaceMe">.  More random text.</p>
</div>';

$desiredHtml = '<div>
<p>Random text and an Replacement for Good Image 1.  More random text.</p>
<p>Random text and an Replacement for Good Image 2.  More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="dontReplaceMe">.  More random text.</p>
<p>Random text and an <img src="badImgage1.png"  alt="" class="replaceMe">.  More random text.</p>
</div>';

Below is what I am attempting to do:

libxml_use_internal_errors(true);   //Temporarily disable errors resulting from improperly formed HTML
$doc = new DOMDocument();
$doc->loadHTML($html);

//What does this do for me?
$imgs= $doc->getElementsByTagName('img');
foreach ($imgs as $img){}

$xpath = new DOMXPath($doc);
foreach( $xpath->query( '//img') as $img) {
    if(true){   //How do I check class and image name?
        $new = $doc->createTextNode("New Attribute"); 
        $img->parentNode->replaceChild($new,$img);
    }
}

$html=$doc->saveHTML();
libxml_use_internal_errors(false);

2 Answers:

Answer 0: (score: 1)

Do it like this; you were on the right track:

$myArray=array(
    'goodImgage1.png'=>'Replacement for Good Image 1',
    'goodImgage2.png'=>'Replacement for Good Image 2'
);

$html = '<div>
<p>Random text and an <img src="goodImgage1.png" alt="" class="replaceMe">.  More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="replaceMe">.  More random text.</p>
<p>Random text and an <img src="goodImgage2.png" alt="" class="dontReplaceMe">.  More random text.</p>
<p>Random text and an <img src="badImgage1.png"  alt="" class="replaceMe">.  More random text.</p>
</div>';

$classesToReplace = array('replaceMe');

libxml_use_internal_errors(true);   //Temporarily disable errors resulting from improperly formed HTML
$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
foreach( $xpath->query( '//img') as $img) {
    // get the classes into an array
    $classes = explode(' ', $img->getAttribute('class')); // this will contain the classes assigned to the element
    $classMatches = array_intersect($classes, $classesToReplace);

    // preprocess the image name to match the $myArray keys
    $imageName = $img->getAttribute('src');

    if (isset($myArray[$imageName]) && $classMatches) {   
        $new = $doc->createTextNode($myArray[$imageName]); 
        $img->parentNode->replaceChild($new,$img);
    }
}

$html = $doc->saveHTML();
var_dump($html);   // var_dump() prints by itself and returns nothing, so no echo is needed

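Note the following:

  • I check for images that have the replaceMe class, possibly alongside other classes.
  • I added the full image file names to the $myArray keys, basically for simplicity.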

Answer 1: (score: 1)

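likeitlikeit was faster. I'll post my answer anyway, though, because it differs in some details: the XPath query does the work of selecting only the <img> elements with the matching class attribute, and pathinfo is used to get the filename without its extension.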
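One caveat about matching the class in XPath: the test @class = 'replaceMe' only matches when the attribute is exactly that string, so an image with class="thumb replaceMe" would be skipped. If that matters, a common XPath 1.0 idiom is to pad the normalized class list with spaces and look for the padded token. A sketch with made-up sample markup:

```php
<?php
// Made-up sample: one exact match, one multi-class match, one non-match.
$sample = '<div><img src="a.png" class="replaceMe">'
        . '<img src="b.png" class="thumb replaceMe">'
        . '<img src="c.png" class="dontReplaceMe"></div>';

$doc = new DOMDocument();
$doc->loadHTML($sample);
$xpath = new DOMXPath($doc);

// Pad the normalized class list with spaces so the token is matched whole.
$query = "//img[contains(concat(' ', normalize-space(@class), ' '), ' replaceMe ')]";

$matched = [];
foreach ($xpath->query($query) as $img) {
    $matched[] = $img->getAttribute('src');
}
print_r($matched); // sources of the images whose class list contains replaceMe
```

The same $query string could replace the one passed to $xpath->query() in the code below; the rest of the loop would stay unchanged.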

$doc = new DOMDocument();
$doc->loadHTML($h); // assume HTML in $h

$xpath = new DOMXPath($doc);
$imgs = $xpath->query("//img[@class = 'replaceMe']");

foreach ($imgs as $img) {

    $imgfile = pathinfo($img->getAttribute("src"),PATHINFO_FILENAME);
    if (array_key_exists($imgfile, $myArray)) { 

        $replacement = $doc->createTextNode($myArray[$imgfile]);
        $img->parentNode->replaceChild($replacement, $img); 
    }
}

echo "<pre>" . htmlentities($doc->saveHTML()) . "</pre>";

See it working: http://codepad.viper-7.com/11XZt7
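A note on the output: saveHTML() called without arguments serializes the whole document, including the DOCTYPE and the <html> and <body> wrappers that loadHTML() adds, whereas $desiredHtml in the question is only the <div> fragment. One possible way to get just the fragment is to pass nodes to saveHTML() (supported since PHP 5.3.6) and serialize only the children of <body>. This is a sketch that assumes the input is a body-level fragment; the shortened $html and the $result variable are illustrative only.

```php
<?php
$myArray = ['goodImgage1' => 'Replacement for Good Image 1'];
// Shortened stand-in for the question's $html (assumption for illustration).
$html = '<div><p>Random text and an <img src="goodImgage1.png" alt="" class="replaceMe">.</p></div>';

libxml_use_internal_errors(true);   // collect parse warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors(false);

$xpath = new DOMXPath($doc);
foreach ($xpath->query("//img[@class = 'replaceMe']") as $img) {
    $name = pathinfo($img->getAttribute('src'), PATHINFO_FILENAME);
    if (array_key_exists($name, $myArray)) {
        $img->parentNode->replaceChild($doc->createTextNode($myArray[$name]), $img);
    }
}

// Serialize only what sits inside <body>, skipping the wrappers loadHTML() added.
$body = $doc->getElementsByTagName('body')->item(0);
$result = '';
foreach ($body->childNodes as $child) {
    $result .= $doc->saveHTML($child);
}
echo $result;
```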