Match all <g> and <img/> tags, except tags that contain a specific string

Time: 2018-04-13 13:55:16

Tags: php regex xml

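I have an SVG file from which I want to remove all <g> tags and the <image> tags inside them, except for the one <g> whose <image> href contains "artworks". I am using PHP; for convenience, the content of the SVG file is shown below. Note that line breaks will be stripped from the SVG file to keep the regex simpler.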

My regex so far is:

(<g transform="(?:.*)?"><\/image><\/g>)  

It matches all <g> tags and the <image> tags inside them.

<?xml version="1.0" encoding="UTF-8" standalone="no" ?><!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd"><svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="100%" height="100%" viewBox="0 0 1879 2053" xml:space="preserve"><desc>Created with Fabric.js 1.6.2</desc>
<defs></defs>
<g transform="translate(939.5 1026.5)">
<image xlink:href="/home/printplusprod/public_html/media/pdp/images/filename1462961406.jpg" x="-939.5" y="-1026.5" style="stroke: none; stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-rule: nonzero; opacity: 1;" width="1879" height="2053" preserveAspectRatio="none"></image>
</g>
<g transform="translate(939.51 1026.5) scale(2.59 2.59)">
<image xlink:href="/home/printplusprod/public_html/media/pdp/images/overlay1462961406.png" x="-362.22" y="-395.76" style="stroke: none; stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-rule: nonzero; opacity: 1;" width="724.44" height="791.52" preserveAspectRatio="none"></image>
</g>
<rect x="-362" y="-395.5" rx="0" ry="0" width="724" height="791" style="stroke: none; stroke-width: 1; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-opacity: 0; fill-rule: nonzero; opacity: 1;" transform="translate(940.23 1027.12) scale(2.59 2.59)"/>
<g transform="translate(938.93 1025.83) scale(2.59 2.59)">
<image xlink:href="/home/printplusprod/public_html/media/pdp/images/filename1462961406.jpg" x="-362" y="-395.5" style="stroke: none; stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-rule: nonzero; opacity: 1;" width="724" height="791" preserveAspectRatio="xMinYMin slice"></image>
</g>
<g transform="translate(784.5 1177.09) scale(1.5 1.5)">
<image xlink:href="/home/printplusprod/public_html/media/pdp/images/artworks/filename1453713655.jpg" x="-240" y="-179.875" style="stroke: none; stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-rule: nonzero; opacity: 1;" width="480" height="359.75" preserveAspectRatio="none"></image>
</g>
<g transform="translate(938.93 1025.83)">
<image xlink:href="/home/printplusprod/public_html/media/pdp/images/overlay1462961406.png" x="-938.9347705562002" y="-1025.8251429695501" style="stroke: none; stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-rule: nonzero; opacity: 1;" width="1877.8695411124004" height="2051.6502859391003" preserveAspectRatio="none"></image>
</g>
</svg>

1 answer:

Answer 0 (score: 0)

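Although XPath can locate elements in an XML document far more reliably than a regex, this problem is also easy to solve with DOMDocument by simply iterating over all <image> nodes.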

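The code below uses getElementsByTagName() to find all <image> nodes and checks whether each href attribute contains "artworks". If it does not, the enclosing group is removed (parentNode is used to walk back up from the <image> tag to its <g> node).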

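$xml = new DOMDocument();
$xml->loadXML($data);
$images = $xml->getElementsByTagName("image");
// Walk the live node list backwards so removals do not shift unvisited items
for ( $i = $images->length-1; $i>= 0; $i-- )   {
    $image = $images->item($i);
    // Keep only images whose href contains "artworks"
    if ( strpos($image->attributes->getNamedItem("href")->nodeValue, "artworks") === false )   {
        // Remove the enclosing <g>, which takes the <image> with it
        $g = $image->parentNode;
        $g->parentNode->removeChild($g);
    }
}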

This assumes $data holds the actual SVG content; if you need to load it directly from a file, change loadXML() to load().
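If the SVG lives on disk, one way to adapt the snippet is sketched here. It swaps loadXML() for DOMDocument::load() and writes the result back with save(). The function name stripNonArtworkGroups, the XLINK_NS constant, and both path parameters are hypothetical names chosen for illustration. The sketch also reads the href with getAttributeNS() and the XLink namespace instead of getNamedItem("href"); that is a substitution, not what the code above does.

```php
<?php
// Hypothetical helper, not part of the original answer.
const XLINK_NS = 'http://www.w3.org/1999/xlink';

function stripNonArtworkGroups(string $inPath, string $outPath, string $needle = 'artworks'): int
{
    $xml = new DOMDocument();
    // load() parses a file, whereas loadXML() parses a string
    if (!$xml->load($inPath)) {
        throw new RuntimeException("Could not parse $inPath");
    }

    $images = $xml->getElementsByTagName('image');
    $removed = 0;
    // Iterate backwards because the node list is live
    for ($i = $images->length - 1; $i >= 0; $i--) {
        $image = $images->item($i);
        // Assumption: the href is always the namespaced xlink:href attribute
        $href = $image->getAttributeNS(XLINK_NS, 'href');
        if (strpos($href, $needle) === false) {
            $g = $image->parentNode;
            $g->parentNode->removeChild($g);
            $removed++;
        }
    }

    $xml->save($outPath);
    return $removed; // number of <g> groups that were removed
}
```

Returning the count of removed groups is only a convenience for checking that the filter did something; it can be dropped if it is not needed.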

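One thing that may look odd is that the loop walks through the <image> nodes backwards. The reason is that the node list is live: when you remove a node the order changes, so removing an earlier node shifts the later ones. Iterating in reverse avoids this problem.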
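An alternative to the reverse loop, sketched under the same assumption that the SVG is available as a string, is to copy the live list into a plain array with iterator_to_array() and then loop forwards over that snapshot. The function name removeGroupsWithout is made up for illustration, and the XLink namespace lookup is a substitution for getNamedItem("href").

```php
<?php
// Hypothetical helper, not part of the original answer.
function removeGroupsWithout(string $data, string $needle): string
{
    $xml = new DOMDocument();
    $xml->loadXML($data);

    // Snapshot of the live node list: later removals do not affect this array
    $images = iterator_to_array($xml->getElementsByTagName('image'));

    foreach ($images as $image) {
        $href = $image->getAttributeNS('http://www.w3.org/1999/xlink', 'href');
        if (strpos($href, $needle) === false) {
            $g = $image->parentNode;
            // Guard: the group may already be detached if it held several images
            if ($g !== null && $g->parentNode !== null) {
                $g->parentNode->removeChild($g);
            }
        }
    }

    return $xml->saveXML();
}
```

The snapshot costs one extra array, but the loop order then matches the document order, which some readers find easier to follow.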

With the example above, the result is...

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1" width="100%" height="100%" viewBox="0 0 1879 2053" xml:space="preserve"><desc>Created with Fabric.js 1.6.2</desc>
<defs/>


<rect x="-362" y="-395.5" rx="0" ry="0" width="724" height="791" style="stroke: none; stroke-width: 1; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-opacity: 0; fill-rule: nonzero; opacity: 1;" transform="translate(940.23 1027.12) scale(2.59 2.59)"/>

<g transform="translate(784.5 1177.09) scale(1.5 1.5)">
<image xlink:href="/home/printplusprod/public_html/media/pdp/images/artworks/filename1453713655.jpg" x="-240" y="-179.875" style="stroke: none; stroke-width: 0; stroke-dasharray: none; stroke-linecap: butt; stroke-linejoin: miter; stroke-miterlimit: 10; fill: rgb(0,0,0); fill-rule: nonzero; opacity: 1;" width="480" height="359.75" preserveAspectRatio="none"/>
</g>

</svg>
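For comparison, the XPath route mentioned at the start of this answer could look roughly like the sketch below. It selects every <g> that has an <image> child whose xlink:href does not contain the search string and removes it. The function name removeGroupsByXPath and the svg and xlink prefixes passed to registerNamespace() are arbitrary choices for this illustration, and the sketch assumes the search string contains no double quotes.

```php
<?php
// Hypothetical helper, not part of the original answer.
function removeGroupsByXPath(string $data, string $needle = 'artworks'): string
{
    $xml = new DOMDocument();
    $xml->loadXML($data);

    $xpath = new DOMXPath($xml);
    // The SVG elements live in a default namespace, so a prefix must be registered
    $xpath->registerNamespace('svg', 'http://www.w3.org/2000/svg');
    $xpath->registerNamespace('xlink', 'http://www.w3.org/1999/xlink');

    // Every <g> with an <image> child whose xlink:href lacks $needle
    // (assumes $needle has no double quotes, which would break the expression)
    $query = sprintf('//svg:g[svg:image[not(contains(@xlink:href, "%s"))]]', $needle);

    // The XPath result is a static list, so a forward loop is fine here
    foreach ($xpath->query($query) as $g) {
        $g->parentNode->removeChild($g);
    }

    return $xml->saveXML();
}
```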