XPath - Return the whole object under one node when the inner nodes are in different namespaces

Time: 2013-11-21 10:16:48

Tags: php xml xpath namespaces simplexml

Another XPath question. I am querying an API whose response contains several nested namespaces.

<feed xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:media="http://search.yahoo.com/mrss/" xmlns="http://www.w3.org/2005/Atom" xmlns:pamedia="http://paimages.co.uk/pamedia.htm">
<title>
    Image / video search results
</title>
<subtitle>
    Images / video found containing the search string provided
</subtitle>
<pamedia:found>
    2228
</pamedia:found>
<pamedia:perpage>
    100
</pamedia:perpage>
<pamedia:page>
    1
</pamedia:page>
<opensearch:totalResults>
    2228
</opensearch:totalResults>
<opensearch:itemsPerpage>
    100
</opensearch:itemsPerpage>
<opensearch:startIndex>
    1
</opensearch:startIndex>
<id>
    http://images.pressassociation.com/cgi/search_api/?state=search&q=miley
</id>
<link rel="self" href="http://images.pressassociation.com/cgi/search_api/?state=search&q=miley"></link>
<link rel="next" href="http://images.pressassociation.com/cgi/search_api/?state=search&q=miley&offset=2"></link>
<link rel="last" href="http://images.pressassociation.com/cgi/search_api/?state=search&q=miley&offset=23"></link>
<updated>
    2013-11-21T09:13:21Z
</updated>
<link rel="self" href="http://images.pressassociation.com/cgi/search_api/?state=search&q=miley">
    <updated>
        2013-11-21T09:13:21Z
    </updated>
    <name>
        Press Association Images
    </name>
    <email>
        redacted
    </email>
</link>
<entry>
    <pamedia:media-type>
        image/jpeg
    </pamedia:media-type>
    <pamedia:event_date>
        2013-11-14
    </pamedia:event_date>
    <pamedia:urn>
        18209006
    </pamedia:urn>
    <pamedia:domain>
        2
    </pamedia:domain>
    <pamedia:domain_prefix>
        PA
    </pamedia:domain_prefix>
    <link type="application/vnd.iptc.g2.newsitem+xml" href="http://images.pressassociation.com/meta/2.18209006.xml"></link>
    <link rel="related" href="http://images.pressassociation.com/meta/2.18209006.html" type="text/html"></link>
    <link rel="related" href="http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18209006.jpg" type="image/jpeg"></link>
    <media:thumbnail width="127" medium="image" height="190" url="http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18209006.jpg" type="image/jpeg"></media:thumbnail>
    <media:content expression="sample" medium="image" width="511" height="767" url="http://images.pressassociation.com/image/preview/2.18209006.jpg" type="image/jpeg"></media:content>
    <media:copyright>
        Associated Press
    </media:copyright>
    <media:content expression="full" medium="photo" width="6041" height="4024" url="http://images.pressassociation.com/image/2.18209006.jpg" type="image/jpeg"></media:content>
    <updated>
        2013-11-15T10:56:44Z
    </updated>
    <summary type="html">
        Fans wait for singer Miley Cyrus before the Bambi 2013 media awards in Berlin, Germany, Thursday, Nov. 14, 2013. (AP Photo/Gero Breloer)
    </summary>
    <rights type="html">
        UK picture buyers only BRO110
    </rights>
    <id>
        http://images.pressassociation.com/meta/2.18209006.xml
    </id>
    <title type="html">
        2013 Bambi Media Awards - Berlin
    </title>
    <category term="E"></category>
    <author>
        <name>
            Gero Breloer/AP
        </name>
    </author>
</entry>
<entry>
    <pamedia:media-type>
        image/jpeg
    </pamedia:media-type>
    <pamedia:event_date>
        2013-11-14
    </pamedia:event_date>
    <pamedia:urn>
        18207923
    </pamedia:urn>
    <pamedia:domain>
        2
    </pamedia:domain>
    <pamedia:domain_prefix>
        PA
    </pamedia:domain_prefix>
    <link type="application/vnd.iptc.g2.newsitem+xml" href="http://images.pressassociation.com/meta/2.18207923.xml"></link>
    <link rel="related" href="http://images.pressassociation.com/meta/2.18207923.html" type="text/html"></link>
    <link rel="related" href="http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18207923.jpg" type="image/jpeg"></link>
    <media:thumbnail width="127" medium="image" height="198" url="http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18207923.jpg" type="image/jpeg"></media:thumbnail>
    <media:content expression="sample" medium="image" width="511" height="796" url="http://images.pressassociation.com/image/preview/2.18207923.jpg" type="image/jpeg"></media:content>
    <media:copyright>
        Associated Press
    </media:copyright>
    <media:content expression="full" medium="photo" width="2801" height="1796" url="http://images.pressassociation.com/image/2.18207923.jpg" type="image/jpeg"></media:content>
    <updated>
        2013-11-15T10:56:44Z
    </updated>
    <summary type="html">
        Singer Miley Cyrus arrives for the Bambi 2013 media awards in Berlin, Germany, Thursday, Nov. 14, 2013. (AP Photo/Gero Breloer)
    </summary>
    <rights type="html">
        UK picture buyers only BRO105
    </rights>
    <id>
        http://images.pressassociation.com/meta/2.18207923.xml
    </id>
    <title type="html">
        2013 Bambi Media Awards - Berlin
    </title>
    <category term="E"></category>
    <author>
        <name>
            Gero Breloer/AP
        </name>
    </author>
</entry>

To get the nodes in the media: namespace together with the other entry nodes, I am using this query: '/namespace:feed/namespace:entry | namespace:entry/media:*'. However, this returns the media nodes as separate array items:

Array
(
    [0] => SimpleXMLElement Object
        (
            [link] => Array
                (
                    [0] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [href] => http://images.pressassociation.com/meta/2.18209006.xml
                                    [type] => application/vnd.iptc.g2.newsitem+xml
                                )

                        )

                    [1] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [href] => http://images.pressassociation.com/meta/2.18209006.html
                                    [rel] => related
                                    [type] => text/html
                                )

                        )

                    [2] => SimpleXMLElement Object
                        (
                            [@attributes] => Array
                                (
                                    [href] => http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18209006.jpg
                                    [rel] => related
                                    [type] => image/jpeg
                                )

                        )

                )

            [updated] => 2013-11-15T10:56:44Z
            [summary] => Fans wait for singer Miley Cyrus before the Bambi 2013 media awards in Berlin, Germany, Thursday, Nov. 14, 2013. (AP Photo/Gero Breloer)
            [rights] => UK picture buyers only BRO110
            [id] => http://images.pressassociation.com/meta/2.18209006.xml
            [title] => 2013 Bambi Media Awards - Berlin
            [category] => SimpleXMLElement Object
                (
                    [@attributes] => Array
                        (
                            [term] => E
                        )

                )

            [author] => SimpleXMLElement Object
                (
                    [name] => Gero Breloer/AP
                )

        )

    [1] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [url] => http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18209006.jpg
                    [medium] => image
                    [type] => image/jpeg
                    [width] => 127
                    [height] => 190
                )

        )

    [2] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [url] => http://images.pressassociation.com/image/preview/2.18209006.jpg
                    [medium] => image
                    [type] => image/jpeg
                    [expression] => sample
                    [width] => 511
                    [height] => 767
                )

        )

    [3] => SimpleXMLElement Object
        (
        )

    [4] => SimpleXMLElement Object
        (
            [@attributes] => Array
                (
                    [url] => http://images.pressassociation.com/image/2.18209006.jpg
                    [medium] => photo
                    [type] => image/jpeg
                    [expression] => full
                    [width] => 6041
                    [height] => 4024
                )

        )

)

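However, I need [1], [2], [3] & [4] nested inside the entry node [0] so that I can extract the values. If anyone can help with this I would be very grateful. If possible, I would like everything returned by a single XPath call.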

1 answer:

Answer 0 (score: 1)

To fetch the entry nodes, use only the first part of your expression:

/atom:feed/atom:entry

To fetch all entry nodes that have media descendant nodes, use the following expression:

/atom:feed/atom:entry[.//media:*]

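Once you have the entry nodes, you need a second expression, evaluated with the entry as the context node, which you pass as the second argument of query()/evaluate():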

.//media:*

Note that you have to register and use your own namespace prefixes on the SimpleXml / DOMXpath instance.
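
A minimal sketch of that note, assuming SimpleXML is used as in the question. The inline feed is a trimmed-down, hypothetical stand-in for the real response, and the `atom` prefix is our own choice; only the namespace URIs have to match the document. The prefix is registered again on each entry before the relative query, to be safe, because the registration is tied to the object it was made on.

```php
<?php
// Trimmed-down, hypothetical feed; the real response has many more elements.
$xml = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <id>http://images.pressassociation.com/meta/2.18209006.xml</id>
    <media:thumbnail url="http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18209006.jpg"/>
    <media:copyright>Associated Press</media:copyright>
  </entry>
</feed>
XML;

$feed = simplexml_load_string($xml);

// The prefixes are ours; they are bound to the namespace URIs of the document.
$feed->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');
$feed->registerXPathNamespace('media', 'http://search.yahoo.com/mrss/');

$mediaNames = [];
foreach ($feed->xpath('/atom:feed/atom:entry[.//media:*]') as $entry) {
    // Register again on the entry before running the relative expression.
    $entry->registerXPathNamespace('media', 'http://search.yahoo.com/mrss/');
    foreach ($entry->xpath('.//media:*') as $media) {
        // getName() returns the element name without the prefix.
        $mediaNames[] = $media->getName();
    }
}

print_r($mediaNames);
```

The inner xpath() call runs relative to each entry, so the media nodes stay associated with the entry they belong to.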

I suggest using DOMXpath::evaluate() rather than SimpleXmlElement::xpath(). Unlike xpath(), evaluate() can return scalar values directly. An expression like string(.//media:copyright) only works with evaluate(), not with xpath() or DOMXpath::query().
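
To illustrate the difference, here is a small sketch under the assumption of a hypothetical one-entry feed whose text is padded with whitespace like the response in the question. It uses normalize-space() instead of string() so the padding is stripped; both are standard XPath 1.0 functions.

```php
<?php
// Hypothetical one-entry feed; the padded text mimics the response in the question.
$xml = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <media:copyright>
        Associated Press
    </media:copyright>
  </entry>
</feed>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');

// query() returns a DOMNodeList, even when only one node matches.
$nodes = $xpath->query('//media:copyright');

// evaluate() can return the scalar result of an XPath function.
$copyright = $xpath->evaluate('normalize-space(//media:copyright)');
$entryCount = $xpath->evaluate('count(/atom:feed/atom:entry)');

var_dump($nodes->length, $copyright, $entryCount);
```

With query() you would still have to read the text from the first node of the list yourself; evaluate() returns the string or number directly.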

Here is a small example that uses DOM + Xpath to query an Atom feed with MediaRss:

// Load the feed and create an XPath object for it
$dom = new DOMDocument();
$dom->load('feed.xml');
$xpath = new DOMXpath($dom);

// Register our own prefixes for the Atom and Media RSS namespaces
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');

// Iterate over all entries that contain a thumbnail
foreach ($xpath->evaluate('/atom:feed/atom:entry[.//media:thumbnail]', NULL, FALSE) as $entry) {
  // Relative expressions: $entry is the context node, string() returns a scalar
  $href = $xpath->evaluate('string(atom:link[@rel = "related" and @type="text/html"]/@href)', $entry, FALSE);
  $src = $xpath->evaluate('string(.//media:thumbnail/@url)', $entry, FALSE);
  printf('<a href="%s"><img src="%s"/></a>', $href, $src);
}
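
If the goal is one nested structure per entry, as asked in the question, the same two-step approach can build it. An XPath 1.0 expression returns a flat node set, so the nesting has to be done in PHP. The following is only a sketch: the inline feed is a shortened, hypothetical version of one entry, and the array keys (`id`, `media`, `name`, `text`, `attributes`) are names made up for this example.

```php
<?php
// Hypothetical, shortened version of one entry from the feed in the question.
$xml = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <media:thumbnail width="127" height="190" url="http://images.pressassociation.com/empicsthumbnail/vol183/block3642/18209006.jpg"/>
    <media:copyright>Associated Press</media:copyright>
    <id>http://images.pressassociation.com/meta/2.18209006.xml</id>
  </entry>
</feed>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);
$xpath = new DOMXPath($dom);
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');

$entries = [];
foreach ($xpath->evaluate('/atom:feed/atom:entry') as $entry) {
    $item = [
        'id' => $xpath->evaluate('normalize-space(atom:id)', $entry),
        'media' => [],
    ];
    // Second, relative expression: the entry is the context node.
    foreach ($xpath->evaluate('media:*', $entry) as $media) {
        $attributes = [];
        foreach ($media->attributes as $attribute) {
            $attributes[$attribute->nodeName] = $attribute->nodeValue;
        }
        $item['media'][] = [
            'name' => $media->localName,
            'text' => trim($media->textContent),
            'attributes' => $attributes,
        ];
    }
    $entries[] = $item;
}

print_r($entries);
```

Each element of $entries then carries its own media nodes, so nothing has to be matched up by array position afterwards.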