Parsing XML with nested namespaces using SimpleXML?

Time: 2013-02-27 06:19:22

Tags: php xml-parsing simplexml

Here is the content of the XML file:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
    <w:body>
        <w:p w:rsidR="00546015" w:rsidRDefault="00546015">
            <w:r>
                <w:t xml:space="preserve">Hello </w:t>
            </w:r>
            <w:proofErr w:type="spellStart"/>
            <w:r>
                <w:t>Doctor</w:t>
            </w:r>
            <w:proofErr w:type="spellEnd"/>
            <w:r>
                <w:t>,</w:t>
            </w:r>
        </w:p>
        <w:p w:rsidR="00546015" w:rsidRDefault="00546015" w:rsidP="00B72192">
            <w:r>
                <w:t xml:space="preserve">I hope you are doing well. Thanks for taking the time to speak with us on Skype yesterday. It is always a pleasure talking with you. </w:t>
            </w:r>
        </w:p>
        <w:p w:rsidR="00546015" w:rsidRDefault="00546015"/>
        .
        .
        .
        .
        .
        and this list goes on

This is my starting code, but I'm not sure whether this is the right approach or whether there is a better way to achieve this.

// load the xml into the object
$xml = simplexml_load_file('word/document.xml');

//Use that namespace
$namespaces = $xml->getNameSpaces(true);

//Now we don't have the URL hard-coded
$w_doc = $xml->children($namespaces['w']);
$document = $w_doc->document;

$w_body = $document->document->children($namespaces['w']);

$body = $w_body->body;

How do I iterate over the elements to get the content of each <w:t>?

1 Answer:

Answer 0: (score: 4)

XPath is probably the easiest approach:

// load the xml into the object
$xml = simplexml_load_file('word/document.xml');

// Collect the namespace prefixes and URIs declared anywhere in the document
$namespaces = $xml->getNamespaces(true);

$xml->registerXPathNamespace('w', $namespaces['w']);

$nodes = $xml->xpath('/w:document/w:body//w:t');

foreach($nodes as $node) {
  echo (string) $node . "\n\n";
}
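
If you would rather walk the tree than use XPath, note that simplexml_load_file() already returns the root element, so $xml is the <w:document> node itself and there is no ->document child to step into. The following is only a rough sketch of that traversal, not part of the original answer: paragraphTexts() is a hypothetical helper name, and the namespace URI is copied from the xmlns:w declaration in the sample file. It joins the <w:t> runs of each <w:p> into one string per paragraph.

```php
<?php
// Hypothetical helper (not from the original answer):
// returns one string per <w:p>, built from its <w:r>/<w:t> children.
function paragraphTexts(SimpleXMLElement $document): array
{
    // Namespace URI taken from the xmlns:w declaration in the sample file.
    $w = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

    $paragraphs = [];

    // $document is already <w:document>, so go straight to <w:body>.
    $body = $document->children($w)->body;

    foreach ($body->children($w)->p as $p) {
        $text = '';
        // Each paragraph holds runs (<w:r>), each run holds text (<w:t>).
        foreach ($p->children($w)->r as $run) {
            foreach ($run->children($w)->t as $t) {
                $text .= (string) $t;
            }
        }
        // Empty paragraphs (<w:p .../>) yield an empty string.
        $paragraphs[] = $text;
    }

    return $paragraphs;
}
```

It would be called as paragraphTexts(simplexml_load_file('word/document.xml')). Unlike the XPath version, this only sees <w:t> elements that sit directly inside a <w:r> directly inside a <w:p>, so text nested deeper (for example inside tables or hyperlinks) would be missed; the //w:t expression above does not have that limitation.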