带地址的R地理编码

时间:2017-05-31 17:12:30

标签: r geocoding ggmap

我有32K行地址,我必须找到长/纬度值。

我使用找到的代码here。我非常感谢这个人创造它,但我有一个问题:

我想编辑它,以便如果循环遇到当前行地址的问题,它只会在Lat / Long字段中声明NA并移动到下一个字段。有谁知道如何实现这一目标?代码如下:

# Geocoding a csv column of "addresses" in R

#load ggmap
library(ggmap)

# Select the file from the file chooser
fileToLoad <- file.choose(new = TRUE)

# Read in the CSV data and store it in a variable 
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)

# Initialize the data frame
geocoded <- data.frame(stringsAsFactors = FALSE)

# Loop through the addresses to get the latitude and longitude of each address and add it to the
# origAddress data frame in new columns lat and lon
for(i in 1:nrow(origAddress))
{
  # Print("Working...")
  result <- geocode(origAddress$addresses[i], output = "latlona", source = "google")
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])
}
# Write a CSV file containing origAddress to the working directory
write.csv(origAddress, "geocoded.csv", row.names=FALSE)

1 个答案:

答案 0 :(得分:5)

您可以使用tryCatch()来隔离地理编码警告,并返回一个与geocode()将返回相同结构(lon,lat,地址)的data.frame。

您的代码将是

# Geocoding a csv column of "addresses" in R

# load ggmap
library(ggmap)

# Select the file from the file chooser
fileToLoad <- file.choose(new = TRUE)

# Read in the CSV data and store it in a variable 
origAddress <- read.csv(fileToLoad, stringsAsFactors = FALSE)

# Loop through the addresses to get the latitude and longitude of each address and add it to the
# origAddress data frame in new columns lat and lon
for(i in 1:nrow(origAddress)) {
  result <- tryCatch(geocode(origAddress$addresses[i], output = "latlona", source = "google"),
                     warning = function(w) data.frame(lon = NA, lat = NA, address = NA))
  origAddress$lon[i] <- as.numeric(result[1])
  origAddress$lat[i] <- as.numeric(result[2])
  origAddress$geoAddress[i] <- as.character(result[3])
}
# Write a CSV file containing origAddress to the working directory
write.csv(origAddress, "geocoded.csv", row.names=FALSE)

或者,您可以更快,更干净地执行此操作,而无需进行循环和错误检查。但是,如果没有可重现的数据示例,则无法知道这是否会保留您需要的所有信息。

# Substituted for for loop
result <- geocode(origAddress$addresses, output = "latlona", source = "google")
origAddress <- cbind(origAddress$addresses, result)