Unable to match an XML element with XPath in R

Asked: 2016-12-22 11:03:30

Tags: r xml xpath

I receive the following XML SOAP response from a web service:

<S:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:S="http://www.w3.org/2003/05/soap-envelope">
   <env:Header/>
   <S:Body>
      <searchResults xsi:schemaLocation="http://eur-lex.europa.eu/search http://eur-lex.europa.eu/eurlex-ws?xsd=3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://eur-lex.europa.eu/search" xmlns:searchLayerHelper="xalan//eu.europa.ec.op.searchlayer.domain.util.SearchLayerHelper">
         <numhits>1</numhits>
         <totalhits>16738</totalhits>
         <page>1</page>
         <language>en</language>
         <result>
            <reference>eng_cellar:9462de2e-80bd-4433-bd97-3465ce7804da_en</reference>
            <rank>1</rank>
            <content>
               <NOTICE>
                  <EXPRESSION>
                     <EXPRESSION_TITLE>
                        <VALUE>Judgment of the Court (Second Chamber) of 8 December 2011. # KME Germany AG, KME France SAS and KME Italy SpA v European Commission. # Appeal - Competition - Agreements, decisions and concerted practices - Market for copper industrial tubes - Fines - Size of the market, duration of the infringement and cooperation capable of being taken into consideration - Effective judicial remedy. # Case C-272/09 P.</VALUE>
                     </EXPRESSION_TITLE>
                     <EXPRESSION_USES_LANGUAGE>
                        <URI>
                           <IDENTIFIER>ENG</IDENTIFIER>
                        </URI>
                     </EXPRESSION_USES_LANGUAGE>
                  </EXPRESSION>
                  <MANIFESTATION>
                     <MANIFESTATION_COURT-REPORT_PART_PAGE_FIRST>
                        <VALUE>00000</VALUE>
                     </MANIFESTATION_COURT-REPORT_PART_PAGE_FIRST>
                     <MANIFESTATION_COURT-REPORT_PART_TYPE>
                        <VALUE>RJ</VALUE>
                     </MANIFESTATION_COURT-REPORT_PART_TYPE>
                     <MANIFESTATION_COURT-REPORT_PART_YEAR>
                        <VALUE>2011</VALUE>
                     </MANIFESTATION_COURT-REPORT_PART_YEAR>
                  </MANIFESTATION>
                  <WORK>
                     <ECLI>
                        <VALUE>ECLI:EU:C:2011:810</VALUE>
                     </ECLI>
                     <ID_CELEX>
                        <VALUE>62009CJ0272</VALUE>
                     </ID_CELEX>
                     <WORK_CREATED_BY_AGENT>
                        <COMPACT_URI>corporate-body:CJ</COMPACT_URI>
                        <IDENTIFIER>CJ</IDENTIFIER>
                        <OP-CODE>CJ</OP-CODE>
                        <PREFLABEL>Court of Justice</PREFLABEL>
                        <URI>
                           <IDENTIFIER>CJ</IDENTIFIER>
                           <TYPE>corporate-body</TYPE>
                           <VALUE>http://publications.europa.eu/resource/authority/corporate-body/CJ</VALUE>
                        </URI>
                     </WORK_CREATED_BY_AGENT>
                     <WORK_DATE_DOCUMENT>
                        <DAY>08</DAY>
                        <MONTH>12</MONTH>
                        <YEAR>2011</YEAR>
                     </WORK_DATE_DOCUMENT>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>CES</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>SPA</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>ENG</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>HRV</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>MLT</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>ELL</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>DAN</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>SWE</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>FRA</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>BUL</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>POL</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>RON</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>LIT</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>SLV</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>NLD</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>FIN</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>DEU</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>LAV</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>POR</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>HUN</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>SLK</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>ITA</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_EXPRESSION>
                        <EMBEDDED_NOTICE>
                           <EXPRESSION>
                              <EXPRESSION_USES_LANGUAGE>
                                 <OP-CODE>EST</OP-CODE>
                              </EXPRESSION_USES_LANGUAGE>
                           </EXPRESSION>
                        </EMBEDDED_NOTICE>
                     </WORK_HAS_EXPRESSION>
                     <WORK_HAS_RESOURCE-TYPE>
                        <IDENTIFIER>JUDG</IDENTIFIER>
                        <OP-CODE>JUDG</OP-CODE>
                        <PREFLABEL>Judgment</PREFLABEL>
                        <URI>
                           <IDENTIFIER>JUDG</IDENTIFIER>
                           <TYPE>resource-type</TYPE>
                           <VALUE>http://publications.europa.eu/resource/authority/resource-type/JUDG</VALUE>
                        </URI>
                     </WORK_HAS_RESOURCE-TYPE>
                  </WORK>
               </NOTICE>
            </content>
         </result>
      </searchResults>
   </S:Body>
</S:Envelope>

What I want to do is retrieve the ID_CELEX element with an XPath expression. I am using the XML library, and this is what I have so far:

library(XML)

# h is the basicTextGatherer that is filled in by curlPerform() against the web service;
# the XML shown above can also be passed directly as a string
doc <- xmlTreeParse(h$value(), asText = TRUE, useInternalNodes = TRUE)
r <- xmlRoot(doc)

# Unprefixed name in the XPath expression: nothing is matched
getNodeSet(r, '//ID_CELEX')
# returns
# list()
# attr(,"class")
# [1] "XMLNodeSet"

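It seems that the ID_CELEX element is in a namespace, but I am not sure how to handle it.

1 answer:

Answer 0 (score: 2)

The ID_CELEX element is in the http://eur-lex.europa.eu/search namespace: searchResults declares it as the default namespace (xmlns="..."), and every unprefixed descendant inherits it. In XPath 1.0 an unprefixed name such as //ID_CELEX only matches elements in no namespace, so you have to bind a prefix to that URI and use it in the expression: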

getNodeSet(r, "//r:ID_CELEX", c(r = "http://eur-lex.europa.eu/search"))

Output:

[[1]]
<ID_CELEX>
  <VALUE>62009CJ0272</VALUE>
</ID_CELEX>

attr(,"class")
[1] "XMLNodeSet"
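
To go from the node set to the CELEX number itself, the same prefix binding can be passed to xpathSApply together with xmlValue. The following is a minimal sketch rather than the answerer's code: it assumes a trimmed-down copy of the response (the xml_text string is a stand-in for h$value()), and the variable names ns, no_ns, celex and celex_local are illustrative. It also shows a local-name() test, a standard XPath 1.0 way to match regardless of namespace.

```r
library(XML)

# Trimmed-down stand-in for the SOAP response (assumption: same namespaces,
# most elements omitted). searchResults carries the default namespace.
xml_text <- '<S:Envelope xmlns:S="http://www.w3.org/2003/05/soap-envelope">
  <S:Body>
    <searchResults xmlns="http://eur-lex.europa.eu/search">
      <result><content><NOTICE><WORK>
        <ID_CELEX><VALUE>62009CJ0272</VALUE></ID_CELEX>
      </WORK></NOTICE></content></result>
    </searchResults>
  </S:Body>
</S:Envelope>'

doc <- xmlParse(xml_text, asText = TRUE)

# Bind an arbitrary prefix ("r") to the default namespace URI
ns <- c(r = "http://eur-lex.europa.eu/search")

# Unprefixed name: XPath looks in no namespace, so the node set is empty
no_ns <- getNodeSet(doc, "//ID_CELEX")

# Prefixed steps: every element on the path needs the prefix, including VALUE
celex <- xpathSApply(doc, "//r:ID_CELEX/r:VALUE", xmlValue, namespaces = ns)

# Alternative that ignores namespaces entirely by testing the local name
celex_local <- xpathSApply(
  doc,
  "//*[local-name() = 'ID_CELEX']/*[local-name() = 'VALUE']",
  xmlValue
)

print(length(no_ns))
print(celex)
print(celex_local)
```

The prefixed form is usually the safer choice when several namespaces reuse the same element names; the local-name() form is convenient for quick exploration because it needs no namespace mapping.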