PHP's SimpleXML does not preserve the order between different element types

Date: 2014-05-04 21:10:18

Tags: php xml xml-parsing simplexml

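As far as I can tell, when you have multiple types of elements at the same level of an XML document tree, PHP's SimpleXML (both SimpleXMLElement and SimpleXMLIterator) does not keep the order of the elements relative to each other, only the order within each element type.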

For example, consider the following structure:

<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
</catalog>

If I take this structure and parse it with SimpleXMLIterator or SimpleXMLElement, I end up with an array that looks like this:

Array (
    [book] => Array (
        [0] => Array (
            [title] => Array (
                [0] => Harry Potter and the Chamber of Secrets
            )
            [author] => Array (
                [0] => J.K. Rowling
            )
        )
        [1] => Array (
            [title] => Array (
                [0] => Great Expectations
            )
            [author] => Array (
                [0] => Charles Dickens
            )
        )
    )
)

This is fine, because I only have book elements and their order is kept correctly. But say I also add movie elements:

<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <movie>
        <title>The Dark Knight</title>
        <director>Christopher Nolan</director>
    </movie>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
    <movie>
        <title>Avatar</title>
        <director>James Cameron</director>
    </movie>
</catalog>

Parsing this with SimpleXMLIterator or SimpleXMLElement produces the following array:

Array (
    [book] => Array (
        [0] => Array (
            [title] => Array (
                [0] => Harry Potter and the Chamber of Secrets
            )
            [author] => Array (
                [0] => J.K. Rowling
            )
        )
        [1] => Array (
            [title] => Array (
                [0] => Great Expectations
            )
            [author] => Array (
                [0] => Charles Dickens
            )
        )
    )
    [movie] => Array (
        [0] => Array (
            [title] => Array (
                [0] => The Dark Knight
            )
            [director] => Array (
                [0] => Christopher Nolan
            )
        )
        [1] => Array (
            [title] => Array (
                [0] => Avatar
            )
            [director] => Array (
                [0] => James Cameron
            )
        )
    )
)

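Because the data is represented this way, I seem to have no way of telling that the books and movies in the XML file actually appear in the order book, movie, book, movie. It just separates them into two categories (although it does keep the order within each category).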
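The grouping described above can be reproduced with a short sketch. It assumes the array comes from a name-keyed conversion, here a JSON round trip; the question does not say how the array was actually built, so that step is an assumption. The sketch compares the keys of the converted array with the element names seen when iterating the object directly.

```php
<?php
// Shortened version of the catalog from the question (titles only).
$xml = <<<XML
<catalog>
    <book><title>Harry Potter and the Chamber of Secrets</title></book>
    <movie><title>The Dark Knight</title></movie>
    <book><title>Great Expectations</title></book>
    <movie><title>Avatar</title></movie>
</catalog>
XML;

$sx = simplexml_load_string($xml);

// Assumption: the array is produced by a name-keyed conversion.
// A JSON round trip keys children by element name, so siblings that
// share a name end up grouped under a single key.
$asArray = json_decode(json_encode($sx), true);
$groupedNames = array_keys($asArray);

// Iterating the object itself visits the children in document order.
$documentOrder = [];
foreach ($sx->children() as $child) {
    $documentOrder[] = $child->getName();
}

print_r($groupedNames);
print_r($documentOrder);
```

The first list contains each element name once, while the second keeps the interleaved sequence, which suggests the ordering is lost in the conversion step rather than in the parser.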

Does anyone know of a workaround, or of a different XML parser that does not behave this way?

2 Answers:

Answer 0 (score: 6)

"If I ... use SimpleXMLIterator or SimpleXMLElement to parse it, I end up with an array" - no, you don't; you get an object that happens to behave like an array in some ways.

The output of a recursive dump of that object is not the same as what you get by iterating over it.

In particular, running foreach ( $some_node->children() as $child_node ) gives you all of a node's children in the order they appear in the document, regardless of name, as the demo below shows.

Code:

$xml = <<<EOF
<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <movie>
        <title>The Dark Knight</title>
        <director>Christopher Nolan</director>
    </movie>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
    <movie>
        <title>Avatar</title>
        <director>James Cameron</director>
    </movie>
</catalog>
EOF;

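// Parse the XML string into a SimpleXMLElement
$sx = simplexml_load_string($xml);
// children() yields every child element in document order, whatever its name
foreach ( $sx->children() as $node )
{
    echo $node->getName(), '<br />';
}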

Output:

book
movie
book
movie
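If a plain array is still needed, the same iteration can build one that keeps document order. The following is a minimal sketch, not part of the original answer: the helper name childrenInOrder and the 'type'/'fields' keys are hypothetical, and it assumes each item has at most one child per name and only one level of nesting.

```php
<?php
$xml = <<<XML
<catalog>
    <book>
        <title>Harry Potter and the Chamber of Secrets</title>
        <author>J.K. Rowling</author>
    </book>
    <movie>
        <title>The Dark Knight</title>
        <director>Christopher Nolan</director>
    </movie>
    <book>
        <title>Great Expectations</title>
        <author>Charles Dickens</author>
    </book>
    <movie>
        <title>Avatar</title>
        <director>James Cameron</director>
    </movie>
</catalog>
XML;

// Hypothetical helper: turn the children of $parent into a list that keeps
// document order, storing each element's name next to its fields.
function childrenInOrder(SimpleXMLElement $parent): array
{
    $items = [];
    foreach ($parent->children() as $child) {
        $fields = [];
        foreach ($child->children() as $field) {
            // Cast to string to read the element's text content.
            // Assumption: field names are unique within one item;
            // a repeated name would overwrite the earlier value.
            $fields[$field->getName()] = (string) $field;
        }
        $items[] = ['type' => $child->getName(), 'fields' => $fields];
    }
    return $items;
}

$sx = simplexml_load_string($xml);
$items = childrenInOrder($sx);

foreach ($items as $index => $item) {
    echo $index, ': ', $item['type'], ' - ', $item['fields']['title'], PHP_EOL;
}
```

Using a numerically indexed list with the element name stored as a value, rather than as a key, is what lets books and movies stay interleaved.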

Answer 1 (score: 1)

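You can use the Order annotation. Note that it belongs to the Simple XML serialization framework for Java linked below, not to PHP's SimpleXML: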

@Root(name="Person")
@Order(elements={"first", "second", "third"})
public class Person {
    private String first;
    private String second;
    private String third;
}

http://simple.sourceforge.net/download/stream/doc/tutorial/tutorial.php#deserialize