Extracting a value from XML in R

Time: 2014-09-08 14:17:27

Tags: xml r geocoding

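I am trying to extract a single value from some geocoding XML output, but I can't get the last step to work. Basically, I plug coordinates into a URL that geocodes the point. I have tried various combinations of xmlParse, xmlTreeParse, and xmlRoot, and even switched the output to JSON. All I want is the Block FIPS value.

For example, using this location: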

url = paste("http://data.fcc.gov/api/block/2000/find?latitude=", "35.8215924013033",
        "&longitude=", "-103.518235473686",
        "&showall=true",
        "&format=xml",sep = "")

I got this far:

doc = xmlParse(url)
root = xmlRoot(doc)

Output:

<Block FIPS="350210001001131"/>

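That's fine, but I only need the number. I tried stripping it out of the output above, but I get an error saying it cannot be coerced to type character. In the long run I will be doing this for hundreds of locations.

What am I doing wrong?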

2 answers:

Answer 0 (score: 0)

Since the API offers a JSON format, it is easier to use that to extract the data.

## the site url 
url = "http://data.fcc.gov/api/block/find?format=json"
## lat & long parameters
lat = 35.8215924013033
long = -103.518235473686
## dynamic url using lat and long 
url = paste0(url,"&latitude=",lat,"&longitude=",long,"&showall=true")
## the API call here    
library(RJSONIO)
dc = fromJSON(url)
## extract the FIPS
dc$Block["FIPS"]

"350210001001368"

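If you would rather use the XML format:

library(XML)
doc = htmlParse(url)  ## use the URL given in the question here
## htmlParse lower-cases element and attribute names, hence 'block' and 'fips'
xpathSApply(doc,'//*/block',xmlGetAttr,"fips")
[1] "350210001001368"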
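Since the question mentions repeating this for hundreds of locations, the JSON lookup can be wrapped in a function and applied over vectors of coordinates. This is only a minimal sketch: `block_url`, `get_block_fips`, and the `pts` data frame are hypothetical names introduced here, the second coordinate pair is made up, and it assumes the endpoint still responds the way it does in this answer.

```r
library(RJSONIO)

# Build the request URL for one point (same endpoint and parameters as above).
block_url <- function(lat, long) {
  paste0("http://data.fcc.gov/api/block/find?format=json",
         "&latitude=", lat, "&longitude=", long, "&showall=true")
}

# Hypothetical helper: fetch one point and return its block FIPS as a string.
# Returns NA if the request fails or the response has no Block$FIPS entry.
get_block_fips <- function(lat, long) {
  tryCatch({
    res  <- fromJSON(block_url(lat, long))
    fips <- res$Block[["FIPS"]]
    if (is.null(fips)) NA_character_ else as.character(fips)
  }, error = function(e) NA_character_)
}

# Hypothetical input: one row per location (second row is illustrative only).
pts <- data.frame(lat  = c(35.8215924013033, 40.0),
                  long = c(-103.518235473686, -100.0))

# One URL per row; handy for checking the requests before sending them.
urls <- mapply(block_url, pts$lat, pts$long)

# Uncomment to perform one HTTP request per row:
# pts$fips <- mapply(get_block_fips, pts$lat, pts$long)
```

Returning `NA` instead of stopping means one bad point does not abort a long run; the failed rows can be retried afterwards.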

Answer 1 (score: 0)

Because the response has a namespace, the XPath queries need a namespace prefix:

library(XML)

url = paste("http://data.fcc.gov/api/block/2000/find?latitude=", "35.8215924013033",
            "&longitude=", "-103.518235473686",
            "&showall=true",
            "&format=xml",sep = "")


doc = xmlParse(url)

as.character(xpathApply(doc, "//fcc:Block/@FIPS", namespaces="fcc"))
## [1] "350210001001131"

as.character(xpathApply(doc, "//fcc:County/@FIPS", namespaces="fcc"))
## [1] "35021"

as.character(xpathApply(doc, "//fcc:County/@name", namespaces="fcc"))
## [1] "Harding"

as.character(xpathApply(doc, "//fcc:State/@FIPS", namespaces="fcc"))
## [1] "35"

as.character(xpathApply(doc, "//fcc:State/@code", namespaces="fcc"))
## [1] "NM"

as.character(xpathApply(doc, "//fcc:State/@name", namespaces="fcc"))
## [1] "New Mexico"
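To experiment with the namespace handling without calling the API, the same kind of query can be run against an inline document. This is a sketch under assumptions: the sample XML below is hand-written to resemble the response (only the attribute values are taken from the outputs above), and the namespace URI `http://data.fcc.gov/api` is assumed, not confirmed by this thread. Here the prefix is bound to the URI explicitly, and `xmlGetAttr` is used so the result is a plain character vector.

```r
library(XML)

# Hand-written sample; the element layout and namespace URI are assumptions.
xml_text <- '<Response xmlns="http://data.fcc.gov/api" status="OK">
  <Block FIPS="350210001001131"/>
  <County FIPS="35021" name="Harding"/>
  <State FIPS="35" code="NM" name="New Mexico"/>
</Response>'

doc <- xmlParse(xml_text, asText = TRUE)

# Bind the prefix "fcc" to the document's default namespace URI.
ns <- c(fcc = "http://data.fcc.gov/api")

# Select the element, then read the attribute from each matching node.
block_fips  <- xpathSApply(doc, "//fcc:Block",  xmlGetAttr, "FIPS", namespaces = ns)
county_name <- xpathSApply(doc, "//fcc:County", xmlGetAttr, "name", namespaces = ns)

print(block_fips)
print(county_name)
```

It is probably safest to keep the FIPS value as a character string and not convert it to a number, since FIPS codes for some states begin with a zero.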