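I am trying to accumulate addresses so that I can plot them on a map in R. I am collecting the addresses by hand and entering them into a .csv file for import into R. The .csv is formatted as follows: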
streetnumber | street | city | state
1150 | FM 1960 West Road | Houston | TX
701 | Keller Parkway | Keller | TX
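Each header (streetnumber, street, city, and state) is its own column, and the data beneath it falls into the corresponding column.
I have R read the information in the .csv and convert it into a format suitable for the Google Maps API. The API then generates an .xml file containing the information that corresponds to the address entered. A minimal working example follows: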
streetnumber1<-paste(data$streetnumber,sep="")
street1<-gsub(" ","+",data$street)
street2<-paste(street1,sep="")
city1<-paste(data$city,sep="")
state1<-paste(data$state,sep="")
url<-paste("http://maps.googleapis.com/maps/api/geocode/xml?address="
,streetnumber1,"+",street2,",+",city1,",+",state1,"&sensor=false",sep="")
Calling url produces two URLs that can be entered into a web browser to navigate to the .xml data provided by the Google Maps API.
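I would like this to happen for every address in the .csv file without having to state how many times a URL should be generated. I feel this is a job for an apply function, but I am not sure how to go about it. Once I have automated the interaction between R and the API, I want to parse the resulting .xml so that I can extract the information I am looking for.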
Answer 0 (score: 6)
The ggmap package has a geocode function, which I strongly recommend using instead of reinventing the wheel here.
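Edit: Since you said "multiple addresses", you may prefer my version, which has a data.frame method, built-in batch geocoding, and some robustness checks, and which allows use of the Bing Maps API (with a limit of 25K per day, rather than 2.5K per day as with Google Maps).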
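As a rough illustration of what such robustness checks might look like (this is not the code behind the linked version), the sketch below wraps an arbitrary geocoding function so that one failed lookup yields NA values instead of aborting the whole batch. The names safe_geocode_all and fake_geocoder are hypothetical. In practice the geocoder argument could be something like ggmap's geocode, assuming it returns lon and lat columns for a single address.

```r
# Hypothetical helper: apply `geocoder` to each address, trapping errors.
# `geocoder` is assumed to take one address string and return a one-row
# data frame with numeric columns `lon` and `lat`.
safe_geocode_all <- function(addresses, geocoder) {
  rows <- lapply(addresses, function(addr) {
    tryCatch(
      {
        out <- geocoder(addr)
        data.frame(lon = as.numeric(out$lon[1]), lat = as.numeric(out$lat[1]))
      },
      # On any error, keep the row but mark the coordinates as missing
      error = function(e) data.frame(lon = NA_real_, lat = NA_real_)
    )
  })
  # One row per input address, in the original order
  cbind(data.frame(address = addresses, stringsAsFactors = FALSE),
        do.call(rbind, rows))
}

# Stand-in geocoder for demonstration only: made-up coordinates, no network access
fake_geocoder <- function(addr) {
  if (!nzchar(addr)) stop("empty address")
  data.frame(lon = -95, lat = 30)
}

result <- safe_geocode_all(c("1150 FM 1960 West Road, Houston, TX", ""),
                           fake_geocoder)
```

Passing the geocoder in as an argument keeps the error handling separate from whichever service does the lookup, so the same wrapper could be reused if you switch providers.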
Answer 1 (score: 4)
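It is not clear to me from the question exactly what you are hoping to get from Google. I assume it is the latitude and longitude. If so, try something like the code below. Edit: Revised, following Ari B. Friedman's comment, to include an alternative (and simpler) approach that uses the geocode function from the ggmap package.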
# Read in the text from your example
mydf <- read.csv(con <- textConnection(
"streetnumber|street|city|state
1150|FM 1960 West Road|Houston|TX
701|Keller Parkway|Keller|TX"), header = TRUE, sep = "|", check.names = FALSE)
# APPROACH 1 - works but Approach 2 probably better (see below)
# Create a new column for the URL to pass to Google API
mydf$url <- with(mydf, paste("http://maps.googleapis.com/maps/api/geocode/xml?address=",
                             streetnumber, "+",
                             gsub(" ", "+", street), ",+",
                             gsub(" ", "+", city), ",+",   # cities may contain spaces too
                             state,
                             "&sensor=false",
                             sep = ""))
# Check to see what we have in the data frame
str(mydf)
library(XML)
latlon <- lapply(mydf$url, function(x) { # process each element in the column 'url'
  myxml <- xmlTreeParse(x, useInternalNodes = TRUE) # fetch and parse the XML at this URL
  # extract the first matching latitude and longitude (as character strings)
  lat <- xpathApply(myxml, '/GeocodeResponse/result/geometry/location/lat', xmlValue)[[1]]
  lon <- xpathApply(myxml, '/GeocodeResponse/result/geometry/location/lng', xmlValue)[[1]]
  # return the latitude and longitude as a one-row data frame of strings
  data.frame(lat = lat, lon = lon, stringsAsFactors = FALSE)
})
# We end up with a list of data frames, so stack them into one,
# keeping the rows in the same order as mydf:
latlon <- do.call(rbind, latlon)
# Then bolt the columns on to your existing data frame
mydf <- cbind(mydf, latlon, stringsAsFactors = FALSE)
# We want the latitude and longitude to be numbers, not characters
mydf$lat <- as.numeric(mydf$lat)
mydf$lon <- as.numeric(mydf$lon)
require(ggmap)
# APPROACH 2 - let ggmap do the heavy lifting (and
# comment out Approach 1 if you use this)
mydf$location <- with(mydf, paste(streetnumber,street, city, state,sep = ", "))
latlon <- geocode(mydf$location)
mydf <- cbind(mydf, latlon, stringsAsFactors = FALSE)
# Now plot.
# Be careful when specifying the zoom argument, because larger values can cause
# points to be dropped by geom_point()
ggmap(get_googlemap(maptype = 'roadmap', zoom = 6, scale = 2), extent = 'panel') +
geom_point(data = mydf, aes(x = lon, y = lat), fill = "red", colour = "black",
size = 3, shape = 21)
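Replacing spaces with "+" by hand, as in Approach 1, covers only one kind of special character. A minimal sketch of an alternative, assuming you still want to build the request URL yourself, is to let utils::URLencode escape the whole address. Here build_geocode_url is a hypothetical helper name, and the base URL is simply the one used in Approach 1.

```r
# Hypothetical helper: build one request URL per address string.
# Nothing is sent over the network here; this only constructs the URLs.
build_geocode_url <- function(address) {
  base <- "http://maps.googleapis.com/maps/api/geocode/xml?address="
  # Encode one address at a time so this does not rely on URLencode()
  # being vectorized; reserved = TRUE also escapes characters such as ","
  encoded <- vapply(address, utils::URLencode, character(1),
                    reserved = TRUE, USE.NAMES = FALSE)
  paste0(base, encoded, "&sensor=false")
}

addresses <- c("1150 FM 1960 West Road, Houston, TX",
               "701 Keller Parkway, Keller, TX")
urls <- build_geocode_url(addresses)
```

Because paste0 is vectorized, urls holds one URL per address without any explicit loop or row count.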
Answer 2 (score: 1)
When using the Google Maps API, it is better to use its JSON API; XML is not as lightweight as JSON.
For continuity, I have modified your original code slightly and used the RJSONIO package.
## I read your data
dat <- read.table(text = '
streetnumber | street | city | state
1150 | FM 1960 West Road | Houston | TX
701 | Keller Parkway | Keller | TX', header = TRUE, sep = '|', strip.white = TRUE)
library(RJSONIO)
## here I use json in place of xml
## the static part of the url request
url.base <- "http://maps.googleapis.com/maps/api/geocode/json?address="
## I create a data.frame with your formatted data
dat2 <- data.frame(
streetnumber1 = paste(dat$streetnumber,sep=""),
street2 = paste(gsub(" ","+",dat$street),sep=""),
city1 = paste(dat$city,sep=""),
state1 = paste(dat$state,sep=""))
## I use apply here to call it for each row
apply(dat2, 1, function(x){
  url <- paste(url.base, x[1], "+", x[2],
               ",+", x[3], ",+", x[4], "&sensor=false", sep="")
  res <- fromJSON(url) ## single statement
  ## e.g. to get lat/long of the first result
  lat.long <- res$results[[1]]$geometry$location
  lat.long
})
Here, res is just a list. You can easily subset and parse it.
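To make the subsetting concrete without calling the API, here is a small sketch that uses a hand-built list standing in for parsed responses. The nesting under results and geometry follows the code above. The location element with lat and lng is an assumption about the response shape, and the coordinate values are placeholders, not real geocoding results.

```r
# Hand-built stand-in for two parsed responses (placeholder coordinates)
fake_responses <- list(
  list(results = list(list(geometry = list(location = c(lat = 30.0, lng = -95.5))))),
  list(results = list(list(geometry = list(location = c(lat = 32.9, lng = -97.2)))))
)

# Pull the first result's location out of each response;
# sapply() returns one column per response, so transpose to one row each
coords <- t(sapply(fake_responses, function(res) {
  res$results[[1]]$geometry$location
}))

# Convert the matrix to a data frame with one row per address
coords_df <- data.frame(lat = coords[, "lat"], lon = coords[, "lng"])
```

The resulting data frame can then be bound onto the original address table with cbind, provided the rows are still in the same order.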