This is a follow-up to a very similar question I asked earlier, but this time I'm trying to get the xmlAttrs rather than the xmlValue. So suppose we have the following:
my.xml <- '
<tv>
<show>
<name>Star Trek TNG</name>
<rating>1.0</rating>
<a href="http://www.google.com">google</a>
</show>
<show>
<name>Doctor Who</name>
<a href="http://www.google.com">google</a>
</show>
<show>
<name>Babylon 5</name>
<rating>2.0</rating>
</show>
</tv>
'
library(XML)
doc <- xmlParse(my.xml)
xpathSApply(doc, '/tv/show', function(x) xmlValue(xmlChildren(x)$a))
# [1] "google" "google" NA
I would like the output to be
# [1] "http://www.google.com" "http://www.google.com" NA
but I can't figure out how to get there. I thought it might be something like this, but I was wrong:
xpathSApply(doc, '/tv/show', function(x) xmlAttrs(xmlChildren(x)$a))
# Error in UseMethod("xmlAttrs", node) :
# no applicable method for 'xmlAttrs' applied to an object of class "NULL"
The closest I've gotten is:
xpathSApply(doc, '/tv/show', function(x) xmlChildren(x)$a)
# [[1]]
# <a href="http://www.google.com">google</a>
#
# [[2]]
# <a href="http://www.google.com">google</a>
#
# [[3]]
# NULL
Answer 0 (score: 5)
You almost had it. You just need to handle the NULL case yourself, because xmlAttrs()
throws an error when it is given NULL:
> xpathSApply(doc, '/tv/show', function(x) ifelse(is.null(xmlChildren(x)$a), NA, xmlAttrs(xmlChildren(x)$a, 'href')))
[1] "http://www.google.com" "http://www.google.com" NA
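The same NULL check can also be written with a plain if/else and xmlGetAttr(), which looks up a single attribute by name and takes a default for when that attribute is missing. This is only a sketch: get_href is a hypothetical helper name, the XML is a shortened version of the question's input, and the NA_character_ default is an assumption about what a missing link should map to.

```r
library(XML)

# Shortened version of the XML from the question: one show with a link, one without.
my.xml <- '
<tv>
<show>
<name>Star Trek TNG</name>
<a href="http://www.google.com">google</a>
</show>
<show>
<name>Babylon 5</name>
</show>
</tv>
'
doc <- xmlParse(my.xml, asText = TRUE)

# get_href is a hypothetical helper: it returns the href of a show's <a> child,
# or NA when the show has no <a> child (or the <a> has no href attribute).
get_href <- function(show) {
  a <- xmlChildren(show)$a
  if (is.null(a)) NA_character_ else xmlGetAttr(a, "href", NA_character_)
}

hrefs <- xpathSApply(doc, '/tv/show', get_href)
print(hrefs)
```

A scalar if/else avoids relying on ifelse() to pick out the first element of whatever xmlAttrs() returns, which matters if the <a> element ever carries more than one attribute.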