Extract the image name from the src of an img tag

Time: 2017-06-04 11:17:24

Tags: php regex

I have an HTML page containing images such as:

<img src="media/lib/pics/1495343165.jpg" style="width: 600px; height: 400px; margin: 5px;" />

I want to extract only the image name "1495343165.jpg" and use it to replace the whole image tag with:
<img src="my/new/path/1495343165.jpg"  />

How can I do this using regex and PHP?

Thanks

2 answers:

Answer 0: (score: 1)

Rather than a regex, you can use XPath to target only the img nodes you want:

$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTMLFile($filePath, LIBXML_HTML_NODEFDTD);
// or $dom->loadHTML($htmlString, LIBXML_HTML_NODEFDTD);

$xp = new DOMXPath($dom);

$nodeList = $xp->query('//img[starts-with(@src, "media/lib/pics/")]');

$newPath = 'my/new/path/';

foreach ($nodeList as $node) {
    $imgFileName = basename($node->getAttribute('src'));
    $imgNode = $dom->createElement('img'); // create a new img element to replace the old img node
    $imgNode->setAttribute('src', $newPath . $imgFileName);
    $node->parentNode->replaceChild($imgNode, $node);
}

$result = $dom->saveHTML();

XPath query details:

//   # everywhere in the DOM tree
img  # an img element
[    # open a predicate
starts-with(@src, "media/lib/pics/") # with a src attribute that starts with "media/lib/pics/"
]    # close the predicate
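
To see the predicate at work, here is a minimal sketch run against a made-up fragment. The second image and its path "other/dir/logo.png" are assumptions added for illustration only. Only the img whose src begins with "media/lib/pics/" should be selected:

```php
<?php
// Hypothetical sample markup: two images, only one lives under media/lib/pics/.
$html = '<div><img src="media/lib/pics/1495343165.jpg"><img src="other/dir/logo.png"></div>';

$dom = new DOMDocument;
libxml_use_internal_errors(true); // collect parser warnings instead of printing them
$dom->loadHTML($html);

$xp = new DOMXPath($dom);
$matches = $xp->query('//img[starts-with(@src, "media/lib/pics/")]');

// Collect the src of every node the predicate let through.
$found = [];
foreach ($matches as $img) {
    $found[] = $img->getAttribute('src');
}

print_r($found);
```

The image under "other/dir/" is left out of the node list, so a replacement loop over that list would not touch it.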

Answer 1: (score: 0)

You can use DOMDocument and basename():

<?php 
$src = '<img src="media/lib/pics/1495343165.jpg" style="width: 600px; height: 400px; margin: 5px;" />';
$doc = new DOMDocument();
$doc->loadHTML($src);

$src = $doc->getElementsByTagName('img')->item(0)->getAttribute('src');

echo '<img src="my/new/path/'.basename($src).'" />';
//<img src="my/new/path/1495343165.jpg" />
?>
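
The snippet above reads only the first img (item(0)). If the page holds several images, the same idea could be wrapped in a loop. What follows is only a sketch: the helper name rewriteImgTags and the second sample image are hypothetical, not part of the original answer.

```php
<?php
// Hypothetical helper: builds one replacement tag per <img> found in $html.
function rewriteImgTags(string $html, string $newPath): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // tolerate imperfect real-world markup
    $doc->loadHTML($html);
    libxml_clear_errors();

    $tags = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        // Keep only the file name and prepend the new directory.
        $src = $newPath . basename($img->getAttribute('src'));
        $tags[] = '<img src="' . htmlspecialchars($src) . '" />';
    }
    return $tags;
}

$tags = rewriteImgTags(
    '<p><img src="media/lib/pics/1495343165.jpg" style="margin: 5px;" /><img src="a/b/c.png" /></p>',
    'my/new/path/'
);
print_r($tags);
```

Each returned string carries only the src attribute, matching the target tag format in the question.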