$xpath->query() only works with the * wildcard

Date: 2014-12-15 16:03:04

Tags: php xml excel xpath

Below is the code I am currently using.

The input XML file is available here: http://pastebin.com/hcQhPSjs

header("Content-Type: text/plain");
$xmlFile = new DOMDocument();
$xmlFile->preserveWhiteSpace = false;
$xmlFile->load("file:///srv/http/nginx/html/xml/UNSD_Quest_Sample.xml");
$xpath = new DOMXPath($xmlFile);
$hier = '//Workbook';
$result = $xpath->query($hier);
foreach ($result as $element) {
  print $element->nodeValue;
  print "\n";
}

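With this $hier value, the query returns no results. I only get nodes back if I use the wildcard * to reach them. So instead of the usual /Workbook/Worksheet/Table/Row/Cell/Data path, I had to fall back to /*/*[6]/*[2]/*.

The input file is an Excel spreadsheet exported to XML, so the problem may lie in the export from xls to xml.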

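What I find odd is that Firefox (my default browser) does not show the namespace attributes of the root element <Workbook>, while Chromium and every text editor I tried do.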
Firefox:

<?mso-application progid="Excel.Sheet"?>
<Workbook>
<DocumentProperties>
<Author>Htike Htike Kyaw Soe</Author>
<Created>2014-01-14T20:37:41Z</Created>
<LastSaved>2014-12-04T10:05:11Z</LastSaved>
<Version>14.00</Version>
</DocumentProperties>
<OfficeDocumentSettings>
<AllowPNG/>
</OfficeDocumentSettings>

Chromium:

<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
<Author>Htike Htike Kyaw Soe</Author>
<Created>2014-01-14T20:37:41Z</Created>
<LastSaved>2014-12-04T10:05:11Z</LastSaved>
<Version>14.00</Version>
</DocumentProperties>
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
<AllowPNG/>
</OfficeDocumentSettings>  

Can someone explain why this happens?

1 answer:

Answer 0: (score: 1)

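You need to register a prefix for the namespace used in the XML and use that prefix in your expressions. Judging from the tags and the element names, I expect the namespace to be urn:schemas-microsoft-com:office:spreadsheet (the Excel spreadsheet namespace). Here is an example: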

$xml = <<<'XML'
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet>
    <Table>
      <Row>
        <Cell>
          <Data>TEST</Data>
        </Cell>
      </Row>
    </Table>
  </Worksheet>
</Workbook>
XML;

$dom = new DOMDocument();
$dom->preserveWhiteSpace = false;
$dom->loadXML($xml);
$xpath = new DOMXpath($dom);
$xpath->registerNamespace('s', 'urn:schemas-microsoft-com:office:spreadsheet');

$expression = '/s:Workbook/s:Worksheet/s:Table/s:Row/s:Cell/s:Data';
$result = $xpath->evaluate($expression);
foreach ($result as $element) {
  print $element->nodeValue;
  print "\n";
}

Output:

TEST
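This also explains why only the wildcard worked in the question. In XPath 1.0 an unprefixed name such as Workbook matches only elements that are in no namespace, while * matches an element in any namespace. The sketch below compares the two on a cut-down document. The XML is a hypothetical stand-in for the exported file, and the prefix s is an arbitrary choice.

```php
<?php
// Hypothetical stand-in for the exported spreadsheet; only the default
// namespace declaration on the root element matters for this comparison.
$xml = <<<'XML'
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet/>
</Workbook>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('s', 'urn:schemas-microsoft-com:office:spreadsheet');

// Count the nodes each expression selects.
$counts = [
  'unprefixed' => $xpath->query('//Workbook')->length,                     // no-namespace elements only
  'wildcard'   => $xpath->query('/*')->length,                             // any namespace
  'local-name' => $xpath->query('//*[local-name() = "Workbook"]')->length, // ignores the namespace
  'prefixed'   => $xpath->query('//s:Workbook')->length,                   // registered namespace
];
print_r($counts);
```

Only the unprefixed expression selects nothing. The local-name() form works without registering anything, but it ignores namespaces entirely, so the prefixed form is usually the safer choice when a document mixes several vocabularies.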

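You should use DOMXpath::evaluate() rather than DOMXpath::query(). For a location path it returns a node list, as in the example above, and it also lets you fetch scalar values with XPath: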

$expression = 'string(/s:Workbook/s:Worksheet/s:Table/s:Row/s:Cell/s:Data)';
echo $xpath->evaluate($expression);
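The file in the question uses more than one namespace: <DocumentProperties> redeclares the default namespace as urn:schemas-microsoft-com:office:office, so it and its children need a second registered prefix. Here is a minimal sketch, assuming a trimmed excerpt of that file. The second row containing MORE is made up for illustration, and the prefixes s and o are my own choice (they do not have to match the prefixes used in the document).

```php
<?php
// Trimmed, partly hypothetical excerpt of the exported workbook.
$xml = <<<'XML'
<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:o="urn:schemas-microsoft-com:office:office">
  <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
    <Author>Htike Htike Kyaw Soe</Author>
    <Version>14.00</Version>
  </DocumentProperties>
  <Worksheet>
    <Table>
      <Row><Cell><Data>TEST</Data></Cell></Row>
      <Row><Cell><Data>MORE</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);

// One prefix per namespace URI that the expressions need to address.
$xpath->registerNamespace('s', 'urn:schemas-microsoft-com:office:spreadsheet');
$xpath->registerNamespace('o', 'urn:schemas-microsoft-com:office:office');

// string() yields a PHP string, count() yields a float.
$author = $xpath->evaluate('string(/s:Workbook/o:DocumentProperties/o:Author)');
$rows   = $xpath->evaluate('count(/s:Workbook/s:Worksheet/s:Table/s:Row)');

echo $author, "\n";
echo (int) $rows, "\n";
```

Note that <Author> needs the o prefix even though it has no xmlns attribute of its own, because it inherits the default namespace declared on <DocumentProperties>.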