R - How to get a specific attribute from XML using xmlEventParse()

Time: 2018-02-04 15:18:17

Tags: r xml sax

I have an XML file that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" 
       AcceptedAnswerId="15" CreationDate="2010-07-19T19:12:12.510" Score="27" 
       ViewCount="1647" Body="some text;" OwnerUserId="8" 
       LastActivityDate="2010-09-15T21:08:26.077" 
       Title="title" AnswerCount="5" CommentCount="1" FavoriteCount="17" />
[...]

For example, I want the script to output "ViewCount: 1647".

library(XML)

fileName <- "test.xml"
startElement = function(name, attrs, .state)  {
  if(name == "row"){
    .state = .state + 1
    #cat("ViewCount: ", xmlAttrs(gg)["ViewCount"]) <- **output result here**
  }
  if(.state == 10){
    cat("Total Row Parsed: ", .state, "\n")
  }
  .state
}
gg <- xmlEventParse(fileName, handlers = list(startElement = startElement), state = 0)
print(gg)

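I have been searching the internet, but the examples I found are too sparse and complicated.

Is there a way to get an attribute the way you would in Python?

viewCount = attributes["ViewCount"]

Thanks a lot!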

2 Answers:

Answer 0: (score: 4)

You can do the same thing in one line with the xpathApply function from the XML library.

Take a look at this code.

library(XML)
a <- xmlParse('<posts>
                <row Id="1" PostTypeId="1" 
              AcceptedAnswerId="15" CreationDate="2010-07-19T19:12:12.510" Score="27" 
              ViewCount="1647" Body="some text;" OwnerUserId="8" 
              LastActivityDate="2010-09-15T21:08:26.077" 
              Title="title" AnswerCount="5" CommentCount="1" FavoriteCount="17" /> </posts>')
xpathApply(a,"/posts/row",xmlGetAttr,"ViewCount")[[1]]
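
If the file is too large to load as a tree with xmlParse, the attribute can also be read inside the SAX handler from the question. The following is a minimal sketch, not a definitive implementation: it assumes a hypothetical two-row sample passed as text (via asText = TRUE) instead of test.xml, and relies on the attrs argument of startElement being a named character vector of the current element's attributes.

```r
library(XML)

# Hypothetical two-row sample standing in for test.xml
xml_text <- '<posts>
  <row Id="1" ViewCount="1647" />
  <row Id="2" ViewCount="42" />
</posts>'

# attrs is a named character vector holding the attributes of the
# element that just opened, so it can be indexed by attribute name
startElement <- function(name, attrs, .state) {
  if (name == "row") {
    cat("ViewCount:", unname(attrs["ViewCount"]), "\n")
    .state <- .state + 1   # count the rows seen so far
  }
  .state                   # the handler must return the updated state
}

# With state supplied, xmlEventParse returns the final state value
rows_parsed <- xmlEventParse(xml_text, asText = TRUE,
                             handlers = list(startElement = startElement),
                             state = 0)
```

Here attrs["ViewCount"] plays the role of the Python-style attributes["ViewCount"] lookup; single-bracket indexing gives NA rather than an error when a row lacks the attribute.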

Answer 1: (score: 2)

The xml2 library can offer a simpler syntax than the XML library. Here is an alternative solution using xml2:

library(xml2)

doc<-read_xml('<posts>
  <row Id="1" PostTypeId="1" 
              AcceptedAnswerId="15" CreationDate="2010-07-19T19:12:12.510" Score="27" 
              ViewCount="1647" Body="some text;" OwnerUserId="8" 
              LastActivityDate="2010-09-15T21:08:26.077" 
              Title="title" AnswerCount="5" CommentCount="1" FavoriteCount="17" /></posts>')

#find all of the "row" nodes
row<-xml_find_all(doc, "row")

#find attribute of interest in the nodes
xml_attr(row, "ViewCount")
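
Because xml_attr is vectorised over the node set, the same call handles many rows at once. The sketch below assumes a hypothetical two-row sample containing only the attributes needed here, and converts the result to integers since xml_attr returns character values.

```r
library(xml2)

# Hypothetical two-row sample; only the attributes used below are included
doc <- read_xml('<posts>
  <row Id="1" ViewCount="1647" />
  <row Id="2" ViewCount="42" />
</posts>')

# find all of the "row" nodes
rows <- xml_find_all(doc, "row")

# xml_attr() returns one character value per node (NA if the attribute is absent)
view_counts <- as.integer(xml_attr(rows, "ViewCount"))

# print one "ViewCount: <n>" line per row
cat(sprintf("ViewCount: %d\n", view_counts), sep = "")
```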