xmlTreeParse does not recognize the tree

Time: 2012-06-17 18:37:11

Tags: xml r api xml-parsing

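I have a simple question. I am trying to get the list of indicators available on the Open Data API. I use the RCurl function getURL to fetch the content of http://api.worldbank.org/indicators, and then call the XML function xmlTreeParse on the resulting XML page. But xmlTreeParse just treats the XML file as one big block of text. Why is that? Thanks!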

library(RCurl)
library(XML)

temp <- getURL("http://api.worldbank.org/indicators")
temp <- xmlTreeParse(temp)

1 Answer:

Answer 0 (score: 2)

You can use:

temp <- getURL("http://api.worldbank.org/indicators")
temp <- xmlParse(temp)
xpathSApply(temp,"//wb:source") # example access data 1
xpathSApply(temp,"//wb:source[@id=2]") # example access data 2

Use xmlParse, or equivalently xmlTreeParse(useInternalNodes=TRUE). Both return an internal document that the XPath functions can query.
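To see the difference between the two parsers without calling the API, here is a minimal sketch on a small inline document. The XML string is a hypothetical stand-in for the real response, and the namespace URI in it is an assumption; the values are copied from the sample output in this answer.

```r
library(XML)

# Hypothetical inline document standing in for the API response.
# The namespace URI is an assumption made for this sketch.
doc_text <- '<wb:indicators xmlns:wb="http://www.worldbank.org">
  <wb:indicator id="AG.AGR.TRAC.NO">
    <wb:name>Agricultural machinery, tractors</wb:name>
    <wb:source id="2">World Development Indicators</wb:source>
  </wb:indicator>
</wb:indicators>'

# Default xmlTreeParse: the tree is converted to nested R objects.
r_tree <- xmlTreeParse(doc_text, asText = TRUE)

# xmlParse keeps the tree as an internal (C-level) document.
c_tree <- xmlParse(doc_text, asText = TRUE)

# Same kind of result as xmlParse, spelled out with the explicit flag.
c_tree2 <- xmlTreeParse(doc_text, asText = TRUE, useInternalNodes = TRUE)

# XPath queries run against the internal document; the wb prefix is
# picked up from the namespace declared on the root element.
source_names <- xpathSApply(c_tree, "//wb:source", xmlValue)
source_ids   <- xpathSApply(c_tree, "//wb:source[@id=2]", xmlGetAttr, "id")

print(class(r_tree))
print(class(c_tree))
print(source_names)
```

Printing the two class vectors shows which representation each call produced; only the internal one is suitable for xpathSApply.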

With a structure this simple, you can convert it to a data frame as follows:

my.df<-xmlToDataFrame(temp)
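If you want explicit control over the columns, for instance to include each indicator's id attribute, one alternative is to build the data frame from XPath queries. This is only a sketch: the element name wb:indicator and the namespace URI are assumptions, and the second record is made up purely to give the table two rows.

```r
library(XML)

# Hypothetical two-record document; the second indicator is invented
# for illustration and does not come from the API.
doc_text <- '<wb:indicators xmlns:wb="http://www.worldbank.org">
  <wb:indicator id="AG.AGR.TRAC.NO">
    <wb:name>Agricultural machinery, tractors</wb:name>
    <wb:source id="2">World Development Indicators</wb:source>
  </wb:indicator>
  <wb:indicator id="XX.EXAMPLE.ID">
    <wb:name>Placeholder indicator</wb:name>
    <wb:source id="2">World Development Indicators</wb:source>
  </wb:indicator>
</wb:indicators>'

doc <- xmlParse(doc_text, asText = TRUE)

# One XPath query per column: an attribute for id, element text for the rest.
indicators <- data.frame(
  id     = xpathSApply(doc, "//wb:indicator", xmlGetAttr, "id"),
  name   = xpathSApply(doc, "//wb:indicator/wb:name", xmlValue),
  source = xpathSApply(doc, "//wb:indicator/wb:source", xmlValue),
  stringsAsFactors = FALSE
)

print(indicators)
```

This assumes every indicator has exactly one name and one source; if some records lack a field, the column vectors would differ in length and data.frame would fail, so real data may need a per-node extraction instead.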

Or convert it to a list:

my.list<-xmlToList(temp)

> my.list[[1]]
$name
[1] "Agricultural machinery, tractors"

$source
$source$text
[1] "World Development Indicators"

$source$.attrs
 id 
"2" 


$sourceNote
[1] "Agricultural machinery refers to the number of wheel and crawler tractors (excluding garden tractors) in use in agriculture at the end of the calendar year specified or during the first quarter of the following year."

$sourceOrganization
[1] "Food and Agriculture Organization, electronic files and web site."

$topics
$topics$topic
$topics$topic$text
[1] "Agriculture & Rural Development  "

$topics$topic$.attrs
 id 
"1" 



$.attrs
              id 
"AG.AGR.TRAC.NO"