Extracting attributes from an XML file

Time: 2018-08-28 20:30:32

Tags: r xml parsing

I have a sample XML file that I am parsing with R:

<ROUGHDRAFT_FILE MV="00" MMV="00" 
    tId="0000">
     <HEADER Location="Utah" dateCreated="1/1/99">
    </HEADER>

    <COVERSHEET>
       <PRIMIARY_INFO eName="John Smith" pList="XXXXX" 
             type="Remodel" cNumber="00000" 
              policyNumber="00000000000"  />
   </COVERSHEET>
</ROUGHDRAFT_FILE>

After loading the XML and naming it file, I get an error. Here is my code:

xml <- xmlParse(file) 

This works fine.

When I try to extract the attributes, it gives me an error:

EstAttribs <- xpathApply(xml, path="//PRIMIARY_INFO", xml_attrs )

Error in UseMethod("xpathApply") : 
  no applicable method for 'xpathApply' applied to an object of class "c('XMLDocument', 'XMLAbstractDocument')"

Any suggestions on how to fix this? Do I have to specify something for xml_attrs?

1 Answer:

Answer 0 (score: 1)

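MrFlick has already given you an answer. Here is another approach that may be useful. As he suggested, do not try to mix functions from the XML library with those from rvest/xml2.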

# here is the rvest and xml2 solution
# read_xml(), xml_find_all() and xml_attrs() come from xml2, so load it explicitly;
# rvest is kept for the %>% pipe
library(xml2)
library(rvest)
xml_file <- read_xml("test.xml")

xml_file %>%
  xml_find_all('//PRIMIARY_INFO') %>%
  xml_attrs()

[[1]]
        eName         pList          type       cNumber  policyNumber 
 "John Smith"       "XXXXX"     "Remodel"       "00000" "00000000000" 
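
If you only need one attribute rather than the whole set, xml2 also has xml_attr(), which takes the attribute name. A minimal sketch, assuming an inline copy of the PRIMIARY_INFO element so it runs without test.xml (the names sample_xml, doc, nodes and e_name are only illustrative):

```r
library(xml2)

# Inline copy of the relevant part of the sample file (assumption: same
# attribute names and values as in the question)
sample_xml <- '<COVERSHEET>
  <PRIMIARY_INFO eName="John Smith" pList="XXXXX"
                 type="Remodel" cNumber="00000"
                 policyNumber="00000000000" />
</COVERSHEET>'

doc <- read_xml(sample_xml)

# Find every PRIMIARY_INFO node, then pull a single attribute by name
nodes <- xml_find_all(doc, "//PRIMIARY_INFO")
e_name <- xml_attr(nodes, "eName")

print(e_name)
```

xml_attr() returns a character vector with one element per matched node, which is often easier to work with than the list returned by xml_attrs().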

# this solution is purely using XML - as suggested by  MrFlick
library(XML)
xml_file <- xmlParse("test.xml")
xpathApply(xml_file, path="//PRIMIARY_INFO", xmlAttrs )

[[1]]
        eName         pList          type       cNumber  policyNumber 
 "John Smith"       "XXXXX"     "Remodel"       "00000" "00000000000" 
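
The same single-attribute lookup can probably be done while staying entirely within the XML package, using xmlGetAttr() with xpathSApply(). A minimal sketch, again assuming an inline copy of the element instead of test.xml (sample_xml, doc and e_name are illustrative names):

```r
library(XML)

# Inline copy of the relevant part of the sample file (assumption: same
# attribute names and values as in the question)
sample_xml <- '<COVERSHEET>
  <PRIMIARY_INFO eName="John Smith" pList="XXXXX"
                 type="Remodel" cNumber="00000"
                 policyNumber="00000000000" />
</COVERSHEET>'

# asText = TRUE tells xmlParse() the argument is XML content, not a file name
doc <- xmlParse(sample_xml, asText = TRUE)

# Apply xmlGetAttr() to each matching node; "eName" is passed through as
# the attribute name
e_name <- xpathSApply(doc, "//PRIMIARY_INFO", xmlGetAttr, "eName")

print(e_name)
```

Note that only XML functions (xmlParse, xpathSApply, xmlGetAttr) appear here, in line with the advice above not to mix the two libraries.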

I think this question may contain information that is useful to you.