How to get a substring of an XML string using PHP

Asked: 2017-01-14 13:27:02

Tags: php xml xpath

So I have an XML string:

http://localhost:8888/?purp=oclcn&xml=<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<record xmlns="http://www.loc.gov/MARC21/slim">
    <leader>00000cam a2200000 a 4500</leader>
    <controlfield tag="001">33333502</controlfield>
    <controlfield tag="008">951010s1996    vtua     b    001 0 eng  </controlfield>
    <datafield ind1=" " ind2=" " tag="010">
      <subfield code="a">   95045582 </subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="020">
      <subfield code="a">1858983274</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="020">
      <subfield code="a">9781858983271</subfield>
    </datafield>
    <datafield ind1="0" ind2="0" tag="245">
      <subfield code="a">Economic sociology /</subfield>
      <subfield code="c">edited by Richard Swedberg.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="260">
      <subfield code="a">Cheltenham, Glos, UK ;</subfield>
      <subfield code="a">Brookfield, Vt., US :</subfield>
      <subfield code="b">E. Elgar Pub. Co.,</subfield>
      <subfield code="c">©1996.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="300">
      <subfield code="a">xv, 574 pages :</subfield>
      <subfield code="b">illustrations ;</subfield>
      <subfield code="c">25 cm.</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="490">
      <subfield code="a">The international library of critical writings in sociology ;</subfield>
      <subfield code="v">5</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="490">
      <subfield code="a">An Elgar reference collection</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="500">
      <subfield code="a">A collection of journal articles previously published between 1940-1994.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Economics</subfield>
      <subfield code="x">Sociological aspects.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Sociology.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Economics.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Économie politique</subfield>
      <subfield code="x">Aspect sociologique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Sociologie.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Économie politique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Economics.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst00902116</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Economics</subfield>
      <subfield code="x">Sociological aspects.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst00902213</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Sociology.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst01123875</subfield>
    </datafield>
    <datafield ind1="1" ind2="7" tag="650">
      <subfield code="a">Economische sociologie.</subfield>
      <subfield code="2">gtt</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Sociologie économique.</subfield>
      <subfield code="2">ram</subfield>
    </datafield>
  </record>

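As you can see, the XML contains nested elements/tags.

I want to use XPath and PHP to retrieve the last tag (the final datafield element), but still have it returned as a string (not an array or object), with its child tags included. How can I do that?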

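2 Answers:

Answer 0: (score: 0)

See http://php.net/manual/de/domdocument.savexml.php: you can call the method $doc->saveXML($node) to serialize a DOM node to a string. So select the DOM element (or, more generally, the node), then call that method on the document, passing in the selected node, to get the XML string representation of that node: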

$xml = <<<EOD
<record xmlns="http://www.loc.gov/MARC21/slim">
    <leader>00000cam a2200000 a 4500</leader>
    <controlfield tag="001">33333502</controlfield>
    <controlfield tag="008">951010s1996    vtua     b    001 0 eng  </controlfield>
    <datafield ind1=" " ind2=" " tag="010">
      <subfield code="a">   95045582 </subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="020">
      <subfield code="a">1858983274</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="020">
      <subfield code="a">9781858983271</subfield>
    </datafield>
    <datafield ind1="0" ind2="0" tag="245">
      <subfield code="a">Economic sociology /</subfield>
      <subfield code="c">edited by Richard Swedberg.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="260">
      <subfield code="a">Cheltenham, Glos, UK ;</subfield>
      <subfield code="a">Brookfield, Vt., US :</subfield>
      <subfield code="b">E. Elgar Pub. Co.,</subfield>
      <subfield code="c">©1996.</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="300">
      <subfield code="a">xv, 574 pages :</subfield>
      <subfield code="b">illustrations ;</subfield>
      <subfield code="c">25 cm.</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="490">
      <subfield code="a">The international library of critical writings in sociology ;</subfield>
      <subfield code="v">5</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="490">
      <subfield code="a">An Elgar reference collection</subfield>
    </datafield>
    <datafield ind1=" " ind2=" " tag="500">
      <subfield code="a">A collection of journal articles previously published between 1940-1994.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Economics</subfield>
      <subfield code="x">Sociological aspects.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Sociology.</subfield>
    </datafield>
    <datafield ind1=" " ind2="0" tag="650">
      <subfield code="a">Economics.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Économie politique</subfield>
      <subfield code="x">Aspect sociologique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Sociologie.</subfield>
    </datafield>
    <datafield ind1=" " ind2="6" tag="650">
      <subfield code="a">Économie politique.</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Economics.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst00902116</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Economics</subfield>
      <subfield code="x">Sociological aspects.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst00902213</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Sociology.</subfield>
      <subfield code="2">fast</subfield>
      <subfield code="0">(OCoLC)fst01123875</subfield>
    </datafield>
    <datafield ind1="1" ind2="7" tag="650">
      <subfield code="a">Economische sociologie.</subfield>
      <subfield code="2">gtt</subfield>
    </datafield>
    <datafield ind1=" " ind2="7" tag="650">
      <subfield code="a">Sociologie économique.</subfield>
      <subfield code="2">ram</subfield>
    </datafield>
  </record>

EOD;

$doc = new DOMDocument();
$doc->loadXML($xml);

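// Match <datafield> elements in any namespace.
$elements = $doc->getElementsByTagNameNS('*', 'datafield');
// item() returns null when the index is out of range (for example, an empty list).
$lastElement = $elements->item($elements->length - 1);

// Serialize only the selected node, child elements included.
echo $doc->saveXML($lastElement);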

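Using XPath does not change anything about how the node is serialized. Here is the same example, but using an XPath expression to select the last datafield element: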

$doc = new DOMDocument();
$doc->loadXML($xml);

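$xpath = new DOMXPath($doc);
// The document uses a default namespace, so bind it to a prefix for XPath.
$xpath->registerNamespace('df', $doc->documentElement->namespaceURI);

// The parentheses make last() apply to the whole result set, not per parent.
$lastElement = $xpath->query('(//df:datafield)[last()]')->item(0);

echo $doc->saveXML($lastElement);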
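
One thing the snippets above do not cover is the case where nothing matches: passing null to saveXML() serializes the whole document rather than a single node. The following is only a sketch of how the XPath variant could be wrapped to guard against that. The function name lastDatafieldXml and the shortened $sample record are hypothetical, made up for illustration, and are not part of the answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the last
// <datafield> element serialized as a string, or null when the XML cannot
// be parsed or contains no such element.
function lastDatafieldXml(string $xml): ?string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return null;
    }

    $xpath = new DOMXPath($doc);
    // Namespace URI taken from the record in the question.
    $xpath->registerNamespace('df', 'http://www.loc.gov/MARC21/slim');

    $nodes = $xpath->query('(//df:datafield)[last()]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    // saveXML($node) serializes just that node and its children.
    return $doc->saveXML($nodes->item(0));
}

// Shortened, made-up sample in the same shape as the question's record.
$sample = '<record xmlns="http://www.loc.gov/MARC21/slim">'
    . '<datafield tag="020"><subfield code="a">1858983274</subfield></datafield>'
    . '<datafield tag="650"><subfield code="a">Sociology.</subfield></datafield>'
    . '</record>';

$result = lastDatafieldXml($sample);
var_dump(is_string($result));
echo $result, PHP_EOL;
```

Returning null instead of echoing lets the caller decide what to do when the record has no datafield at all.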

Answer 1: (score: 0)

Presumably, what you are looking for is

if (($xml = simplexml_load_string($xml_string)) !== FALSE) {
    $xml->registerXPathNamespace('marc', 'http://www.loc.gov/MARC21/slim');

    // Retrieve the last "datafield" element
    $results = $xml->xpath('/marc:record/marc:datafield[last()]');
    if ($results !== FALSE and ($datafield = reset($results)) !== FALSE) {
        // Process the element, or simply output it with:
        echo $datafield->saveXML();
    }
}
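
If the string needs to be kept rather than printed, the same SimpleXML approach can store it in a variable. This is a minimal sketch under assumptions: the $xml_string sample below is a shortened, made-up record, and $datafieldXml is a hypothetical variable name. It uses asXML(), which saveXML() is an alias of.

```php
<?php
// Shortened, made-up sample in the same shape as the question's record.
$xml_string = '<record xmlns="http://www.loc.gov/MARC21/slim">'
    . '<datafield tag="020"><subfield code="a">1858983274</subfield></datafield>'
    . '<datafield tag="650"><subfield code="a">Sociology.</subfield></datafield>'
    . '</record>';

$datafieldXml = null; // stays null if parsing fails or nothing matches

$xml = simplexml_load_string($xml_string);
if ($xml !== false) {
    $xml->registerXPathNamespace('marc', 'http://www.loc.gov/MARC21/slim');

    // xpath() yields an array of SimpleXMLElement objects on success.
    $results = $xml->xpath('/marc:record/marc:datafield[last()]');
    if (!empty($results)) {
        // Called without a filename, asXML() returns the element as a string.
        $datafieldXml = $results[0]->asXML();
    }
}

var_dump(is_string($datafieldXml));
echo $datafieldXml, PHP_EOL;
```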