Question

I'm completely new to XPath, so please go easy on me ;-)

I'm trying to get the content of a node.

The XML structure looks like this (simplified OOXML):

 <w:p>
     <w:r>
         <w:drawing>
             <wp:anchor wp14:editId="3BCCBF8F" wp14:anchorId="1109B0B5" 
             distR="114300" distL="114300" distB="0" distT="0" 
             allowOverlap="1" layoutInCell="1" locked="0" behindDoc="0" 
             relativeHeight="251663360" simplePos="0">
                 <a:graphic a="{url}">
                     <a:graphicData uri="{urli}">
                         <pic:pic xmlns:pic="{uri}">
                             <pic:blipFill>
                                 <a:blip cstate="print" r:embed="rId13"/>
{all closing tag p, r, w etc}

 <w:p>
     <w:r>
         <w:drawing>
             <wp:anchor wp14:editId="3BCCBF8F" wp14:anchorId="1109B0B5" 
             distR="114300" distL="114300" distB="0" distT="0" 
             allowOverlap="1" layoutInCell="1" locked="0" behindDoc="0" 
             relativeHeight="251663360" simplePos="0">
                 <a:graphic a="{url}">
                     <a:graphicData uri="{urli}">
                         <pic:pic xmlns:pic="{uri}">
                             <pic:blipFill>
                                 <a:blip cstate="print" r:embed="rId14"/>
{all closing tag p, r, w etc}

My code is as follows:

$result below is just a string containing the XML.

$document = new DOMDocument();
$document->loadXML($result);
$xpath = new DOMXpath($document);

$xpath->registerNamespace(
    'word', 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
);

foreach ($xpath->evaluate('//word:drawing//word:anchor') as $index => $node) {
    var_dump($node);
}

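I get an empty node list. I'm obviously doing something wrong. I expected this code to return the anchor elements.

I could basically loop through every drawing node and look up its children one by one, but that seems like a waste of XPath ...

Something like this: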

foreach ($xpath->evaluate('//word:drawing') as $index => $node) {
    foreach($xpath->evaluate('*', $node) as $anchornode) {
        var_dump($anchornode);
    } 
}

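What I really want is to get the r:embed values inside the drawing elements (rId13 and rId14).

I've been trying to find what I need in other questions on SO (there are plenty). ... If you find one, please point me to it.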

Answer 1

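wp:anchor is in a different namespace (not the one used by w:document). Look for the xmlns:wp attribute. That is the namespace definition for the wp prefix.

You have to register an alias/prefix for that namespace as well.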

$xpath->registerNamespace(
   'word', 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
);    
$xpath->registerNamespace(
   'wp', 'urn:???'
);
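
If you would rather not copy the URI by hand, the DOM can report what the document itself binds to a prefix. The following is a minimal sketch, not the only way to do it: the XML is a tiny stand-in, and both urn:example:* URIs are placeholders invented for this illustration. It assumes the xmlns:wp declaration sits on the root element (or one of its ancestors); a prefix declared only further down the tree would not be found this way.

```php
<?php
// Minimal stand-in for the real document; both namespace URIs are placeholders.
$xml = <<<'XML'
<w:document xmlns:w="urn:example:w" xmlns:wp="urn:example:wp">
  <w:drawing><wp:anchor/></w:drawing>
</w:document>
XML;

$document = new DOMDocument();
$document->loadXML($xml);
$root = $document->documentElement;

// Ask the DOM which URIs the document binds to the "w" and "wp" prefixes.
$wUri  = $root->lookupNamespaceURI('w');
$wpUri = $root->lookupNamespaceURI('wp');

$xpath = new DOMXPath($document);
$xpath->registerNamespace('w', $wUri);
$xpath->registerNamespace('wp', $wpUri);

// With both prefixes registered, the anchor element can be addressed directly.
$anchors = $xpath->evaluate('//w:drawing/wp:anchor');

echo $wpUri, "\n";
echo $anchors->length, "\n";
```

Hard-coding the URI is usually the more robust choice, because the URI identifies the vocabulary while the prefix is whatever the producing application happened to pick.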

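Your code registers the prefix word for the namespace URI http://schemas.openxmlformats.org/wordprocessingml/2006/main

This allows the XPath processor to resolve the prefixes used in the XPath expression. You can read it as: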

//word:drawing -> //{http://schemas.openxmlformats.org/wordprocessingml/2006/main}drawing

The XML parser does the same for node names.

<w:drawing/> -> <{http://schemas.openxmlformats.org/wordprocessingml/2006/main}drawing/>
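
You can see the resolved parts on any DOM node. This is a small sketch for illustration only: the urn:example:w URI and the prefix "anything" are made up, and the point is merely that the node is identified by its namespace URI plus local name, not by the prefix.

```php
<?php
// The namespace URI below is a placeholder used only for this illustration.
$document = new DOMDocument();
$document->loadXML('<w:drawing xmlns:w="urn:example:w"/>');

$node = $document->documentElement;

echo $node->nodeName, "\n";      // qualified name as written, prefix included
echo $node->prefix, "\n";        // the alias used in the document
echo $node->localName, "\n";     // the name without the prefix
echo $node->namespaceURI, "\n";  // the URI the prefix resolves to

// A different alias registered for the same URI matches the same node.
$xpath = new DOMXPath($document);
$xpath->registerNamespace('anything', 'urn:example:w');
$matches = $xpath->evaluate('/anything:drawing');

echo $matches->length, "\n";
```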

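That is how the match is made. However, because names like these are really hard (for humans) to read and would make XML files very large, aliases/prefixes are used. You can use the same prefixes in the XPath expression as in the document (w, wp, ...), but you have to register them for the same namespace URIs. Think of a prefix as a variable name: pick one that is easy to read, so the code can still be understood later.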
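
Applied to the original goal (the r:embed values inside the drawing elements), a sketch could look like the one below. Treat it as an assumption-laden example rather than a drop-in solution: the XML is a cut-down stand-in for the real file, and the namespace URIs are the ones commonly declared in OOXML documents, so compare them with the xmlns:* attributes in your own document before relying on them. The variable names $ns and $embedIds are made up for this example.

```php
<?php
// Assumed namespace URIs; verify them against the xmlns:* declarations
// in the actual document.
$ns = [
    'w'   => 'http://schemas.openxmlformats.org/wordprocessingml/2006/main',
    'wp'  => 'http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing',
    'a'   => 'http://schemas.openxmlformats.org/drawingml/2006/main',
    'pic' => 'http://schemas.openxmlformats.org/drawingml/2006/picture',
    'r'   => 'http://schemas.openxmlformats.org/officeDocument/2006/relationships',
];

// Cut-down stand-in for the XML string from the question.
$result = <<<XML
<w:document xmlns:w="{$ns['w']}" xmlns:wp="{$ns['wp']}" xmlns:a="{$ns['a']}"
            xmlns:pic="{$ns['pic']}" xmlns:r="{$ns['r']}">
  <w:p><w:r><w:drawing><wp:anchor><a:graphic><a:graphicData>
    <pic:pic><pic:blipFill><a:blip cstate="print" r:embed="rId13"/></pic:blipFill></pic:pic>
  </a:graphicData></a:graphic></wp:anchor></w:drawing></w:r></w:p>
  <w:p><w:r><w:drawing><wp:anchor><a:graphic><a:graphicData>
    <pic:pic><pic:blipFill><a:blip cstate="print" r:embed="rId14"/></pic:blipFill></pic:pic>
  </a:graphicData></a:graphic></wp:anchor></w:drawing></w:r></w:p>
</w:document>
XML;

$document = new DOMDocument();
$document->loadXML($result);
$xpath = new DOMXPath($document);

// Register every prefix used in the expression for its namespace URI.
foreach ($ns as $prefix => $uri) {
    $xpath->registerNamespace($prefix, $uri);
}

// Select the r:embed attribute of every a:blip below a w:drawing.
$embedIds = [];
foreach ($xpath->evaluate('//w:drawing//a:blip/@r:embed') as $attribute) {
    $embedIds[] = $attribute->value;
}

print_r($embedIds);
```

Attributes need a prefix in the expression whenever they carry one in the document, which is why r has to be registered as well; unprefixed attributes such as cstate are in no namespace and are addressed as plain @cstate.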
