SimpleXMLElement only parses the first field of the XML content

Time: 2014-02-26 08:46:53

Tags: php xml

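I have a somewhat strange problem. I have some internal sites that share content in a way similar to an RSS feed. By that I mean sites that serve XML content containing some important information.

A simple sample of the XML (there are a dozen or so entries) looks like this: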

<?xml version='1.0' encoding='UTF-8'?>
<nvd xmlns:scap-core="http//0.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:patch="http//patch/0.1" xmlns="http//obj/0.1" xmlns:lang="http//lang/2.0" xmlns:cvss="http//cvss-v2/0.2" xmlns:object="http//object/0.4" nvd_xml_version="2.0" pub_date="2014-02-25T10:00:00" xsi:schemaLocation="http//patch/0.1 http//schema/patch_0.1.xsd http//0.1 http//schema/scap-core_0.1.xsd http//obj/0.1 http//schema/nvd-cve-feed_2.0.xsd">
  <entry id="0528">
    <object:configuration id="site.com/">
      <lang:logical-test negate="false" operator="OR">
        <lang:fact-ref name="version:2.6.0"/>
        <lang:fact-ref name="version:2.6.1"/>
        <lang:fact-ref name="version:2.6.2"/>
        <lang:fact-ref name="version:2.6.3"/>
      </lang:logical-test>
    </object:configuration>
    <object:list>
      <object:product>version:2.6.3</object:product>
      <object:product>version:2.6.0</object:product>
      <object:product>version:2.6.1</object:product>
      <object:product>version:2.6.2</object:product>
    </object:list>
    <object:id>0528</object:id>
    <object:published-datetime>2014-02-17T11:55:04.787-05:00</object:published-datetime>
    <object:last-modified-datetime>2014-02-21T09:14:10.780-05:00</object:last-modified-datetime>
    <object:cwe id="264"/>
  </entry>

I want to read this XML so that I can put the values into my database. This is my approach:

$ch = curl_init();

   if (FALSE === $ch)
       throw new Exception('failed to initialize');

curl_setopt($ch, CURLOPT_URL,"internal.adres.com");
curl_setopt($ch, CURLOPT_FRESH_CONNECT, TRUE);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

$content = curl_exec($ch);
$xml = new SimpleXMLElement($content);

foreach ($xml as $obj){
    var_dump($obj);
    break;
}

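This is where the magic happens. When I run var_dump($xml), I get a list of objects, but these objects only have the id field (other fields such as product and datetime are missing).

The result of var_dump($obj) looks like this: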

object(SimpleXMLElement)#3 (1) { ["@attributes"]=> array(1) { ["id"]=> string(13) "0528" } } 

How can I get all the fields of this XML?

2 answers:

Answer 0: (score: 0)

You are looking at the attributes of the <entry> element.

Loop over <obj>.

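Answer 1: (score: 0)

Simplifying the XML you provided (removing the namespaces, correcting the closing tag and adding a header to it), I put together the following example.

It shows a few different methods you can use to access the attributes and nodes in the document.

In short: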

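  • Attributes can be referenced as array elements of the node.
    • E.g. $node['id']
  • Sub-nodes can be referenced as member variables, which can then be looped over like an array.
    • E.g. $node->subNode, or foreach( $node->subNode as $subNode )
  • You can chain the references together.
    • E.g. $node->subNode[0]['id']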

I hope the following example makes sense against your structure...

<?php

$content = '<?xml version="1.0"?>
<entries>
  <entry id="0528">
    <configuration id="google.com">
      <logical negate="false" operator="OR">
        <fact name="1.0.0"/>
      </logical>
    </configuration>
    <list>
      <product>1.0.0</product>
    </list>
    <id>0528</id>
    <datetime>2014-02-17T11:55:04.787-05:00</datetime>
    <last-modified-datetime>2014-02-21T09:14:10.780-05:00</last-modified-datetime>
  </entry>
</entries>';


$xml = new SimpleXMLElement($content);

foreach ($xml as $entry){

    // attributes can be referenced as array elements
    $entryId = $entry['id'];

    echo( "Entry id is {$entryId}\r\n" );

    // Sub-nodes can be referenced as member variables and looped over
    foreach( $entry->configuration as $configuration ) {

        $configurationId = $configuration['id'];
        echo( "Configuration id is {$configurationId}\r\n" );

        foreach( $configuration->logical as $logical ) {

            // You can string the methods together like this:
            $factName = $logical->fact[0]['name'];
            echo( "Logical fact name = $factName\r\n" );
        }

    }

}

?>

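Based on the updated question, the problem appears to be related to the namespaces.

These definitions can be stripped out...

This may not be the recommended way, though (registering the namespaces is probably the way to go, but they don't appear to have valid URIs, so that might not be an option).

My example becomes: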

<?php

$content = '<?xml version=\'1.0\' encoding=\'UTF-8\'?>
<nvd xmlns:scap-core="http//0.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:patch="http//patch/0.1" xmlns="http//obj/0.1" xmlns:lang="http//lang/2.0" xmlns:cvss="http//cvss-v2/0.2" xmlns:object="http//object/0.4" nvd_xml_version="2.0" pub_date="2014-02-25T10:00:00" xsi:schemaLocation="http//patch/0.1 http//schema/patch_0.1.xsd http//0.1 http//schema/scap-core_0.1.xsd http//obj/0.1 http//schema/nvd-cve-feed_2.0.xsd">
  <entry id="0528">
    <object:configuration id="site.com/">
      <lang:logical-test negate="false" operator="OR">
        <lang:fact-ref name="version:2.6.0"/>
        <lang:fact-ref name="version:2.6.1"/>
        <lang:fact-ref name="version:2.6.2"/>
        <lang:fact-ref name="version:2.6.3"/>
      </lang:logical-test>
    </object:configuration>
    <object:list>
      <object:product>version:2.6.3</object:product>
      <object:product>version:2.6.0</object:product>
      <object:product>version:2.6.1</object:product>
      <object:product>version:2.6.2</object:product>
    </object:list>
    <object:id>0528</object:id>
    <object:published-datetime>2014-02-17T11:55:04.787-05:00</object:published-datetime>
    <object:last-modified-datetime>2014-02-21T09:14:10.780-05:00</object:last-modified-datetime>
    <object:cwe id="264"/>
  </entry>
</nvd>';


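// Removes the xmlns / xmlns:prefix declarations from the root element
$content = preg_replace('/xmlns[^=]*="[^"]*"/i', '', $content);

// Gets rid of the namespace prefixes on element and attribute names
$content = preg_replace('/[a-zA-Z]+:([a-zA-Z]+[\W=>])/', '$1', $content);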

$xml = new SimpleXMLElement($content);

foreach ($xml as $entry){

    // attributes can be referenced as array elements
    $entryId = $entry['id'];


    echo( "Entry id is {$entryId}\r\n" );

    // Sub-nodes can be referenced as member variables and looped over
    foreach( $entry->configuration as $configuration ) {

        $configurationId = $configuration['id'];
        echo( "Configuration id is {$configurationId}\r\n" );

        // Note for hyphenated nodes you need to wrap in quotes and curlies
        foreach( $configuration->{'logical-test'} as $logical ) {

            $testOperator = $logical['operator'];
            echo( "Test Operator = $testOperator\r\n" );

            // You can string the methods together like this:
            $factName = $logical->{'fact-ref'}[0]['name'];
            echo( "Logical fact name = $factName\r\n" );
        }

    }

}

?>

Which outputs:

Entry id is 0528
Configuration id is site.com/
Test Operator = OR
Logical fact name = version:2.6.0

Note that to access nodes with a hyphen in their name, you need to wrap the name in quotes and curly braces, e.g. $logical->{'fact-ref'}.
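As an alternative to stripping the namespaces with regular expressions, here is a minimal sketch that reads the namespaced nodes directly. It relies on SimpleXMLElement::children() and SimpleXMLElement::registerXPathNamespace(), which match on the namespace string exactly as it is declared in the document, so the URI does not need to resolve to anything. The feed is a trimmed copy of the one in the question; the $rows array, its keys and the 'o' XPath prefix are my own names for illustration, and the libxml_use_internal_errors() call is there on the assumption that libxml may warn about the non-absolute URIs.

```php
<?php
// Trimmed copy of the feed from the question. The namespace URIs are copied
// as they appear there (relative, e.g. "http//object/0.4").
$content = <<<'XML'
<?xml version='1.0' encoding='UTF-8'?>
<nvd xmlns="http//obj/0.1" xmlns:lang="http//lang/2.0" xmlns:object="http//object/0.4">
  <entry id="0528">
    <object:configuration id="site.com/">
      <lang:logical-test negate="false" operator="OR">
        <lang:fact-ref name="version:2.6.0"/>
        <lang:fact-ref name="version:2.6.1"/>
      </lang:logical-test>
    </object:configuration>
    <object:list>
      <object:product>version:2.6.3</object:product>
      <object:product>version:2.6.0</object:product>
    </object:list>
    <object:published-datetime>2014-02-17T11:55:04.787-05:00</object:published-datetime>
  </entry>
</nvd>
XML;

// Assumption: libxml may warn that these namespace URIs are not absolute;
// collect such messages internally instead of printing PHP warnings.
libxml_use_internal_errors(true);

$xml = new SimpleXMLElement($content);

$objectNs = 'http//object/0.4';
$langNs   = 'http//lang/2.0';

$rows = array(); // hypothetical structure, e.g. one row per database insert

// <entry> sits in the default (unprefixed) namespace, so plain property
// access reaches it, just like the foreach in the question does.
foreach ($xml->entry as $entry) {
    // Switch to the children that live in the "object" namespace.
    $object = $entry->children($objectNs);

    $products = array();
    foreach ($object->list->product as $product) {
        $products[] = (string) $product;
    }

    $facts = array();
    $tests = $object->configuration->children($langNs)->{'logical-test'};
    foreach ($tests as $test) {
        foreach ($test->{'fact-ref'} as $fact) {
            // attributes() with no argument returns the un-namespaced attributes.
            $facts[] = (string) $fact->attributes()->name;
        }
    }

    $rows[] = array(
        'id'            => (string) $entry['id'],
        'configuration' => (string) $object->configuration->attributes()->id,
        'products'      => $products,
        'facts'         => $facts,
        'published'     => (string) $object->{'published-datetime'},
    );
}

// XPath route: register a prefix of your own choosing for the same URI.
$xml->registerXPathNamespace('o', $objectNs);
$productNodes = $xml->xpath('//o:product');

print_r($rows);
echo count($productNodes), " product nodes found via XPath\n";
```

The attributes are read through attributes() rather than $node['id'] once children($ns) has been used, because the namespace context carries over to the nodes it returns and the id and name attributes here are not namespaced. Whether this is preferable to the regex approach probably depends on how much of the feed you need; it leaves the document untouched, at the cost of spelling out the namespace for each prefixed level.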