How to find the county for each city in a vector of city names using R?

Time: 2016-04-18 05:04:38

Tags: r geolocation

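Given a vector of city names, how can I use R to find the county each city belongs to? I looked at the mapacs packages, but I have no experience with them. The goal is to find county-level data associated with the cities in my data.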

Say you have the following:

city <- c("RALEIGH", "HOLLYWOOD", "DALLAS", "MOUNTAIN VIEW", "OKLAHOMA CITY", "ORLANDO")
state <- c("NC", "CA", "TX", "CA", "OK", "FL")

1 answer:

Answer 0: (score: 2)

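"You can get city/state information in tab-separated value format from GeoNames.org. The data is free, comprehensive and well structured. For US data, grab the US.txt file on the free postal code data page. The readme.txt file on that page describes the format." See post by Joshua Frank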
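As a rough sketch of the layout the readme describes, the snippet below names all twelve columns and parses one line in that format. The column names reflect my reading of the GeoNames postal code readme and should be checked against readme.txt; `geonames_cols`, `sample_line` and `sample_df` are hypothetical names, and the sample line is made up for illustration rather than copied from US.txt.

```r
# Column names as I understand them from the GeoNames postal code readme
# (an assumption; verify against readme.txt before relying on them).
geonames_cols <- c("country_code", "postal_code", "place_name",
                   "admin_name1", "admin_code1", "admin_name2", "admin_code2",
                   "admin_name3", "admin_code3", "latitude", "longitude",
                   "accuracy")

# One illustrative line in the same tab-separated layout (not real file content).
sample_line <- paste("US", "27601", "Raleigh", "North Carolina", "NC", "Wake",
                     "183", "", "", "35.77", "-78.64", "4", sep = "\t")

# Read everything as character so postal codes keep any leading zeros.
sample_df <- read.delim(text = sample_line, header = FALSE,
                        col.names = geonames_cols, colClasses = "character")

# Columns 3, 5 and 6 are the ones used for city, state and county.
sample_df[c("place_name", "admin_code1", "admin_name2")]
```

Reading the real file with `col.names = geonames_cols` would make the later column selection by name rather than by position, which may be easier to audit.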

## Download the file

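temp <- tempfile()
# mode = "wb" keeps the zip archive from being corrupted on Windows
download.file("http://download.geonames.org/export/zip/US.zip", temp, mode = "wb")
con <- unz(temp, "US.txt")
# US.txt is tab-separated and has no header row
US <- read.delim(con, header = FALSE, stringsAsFactors = FALSE)
unlink(temp)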

## Find state and county

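# Columns 3, 5 and 6 hold the place name, state abbreviation and county name
colnames(US)[c(3, 5, 6)] <- c("city", "state", "county")
# Compare city names case-insensitively
US$city <- tolower(US$city)
myCityNames <- tolower(c("RALEIGH", "HOLLYWOOD", "DALLAS", "MOUNTAIN VIEW", "OKLAHOMA CITY", "ORLANDO"))
myCities <- US[US$city %in% myCityNames, ]
myCities <- myCities[c("city", "state", "county")]
# The file has one row per postal code, so drop repeated city/state/county rows
myCities <- myCities[!duplicated(myCities), ]
myCities <- myCities[order(myCities$city, myCities$state, decreasing = TRUE), ]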

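The problem is that several cities share the same name across different states.

If you know exactly which state each of your cities is in, this might help: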
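To see how widespread the ambiguity is, one option is to count the distinct states per city name. This is a minimal sketch on a small hand-built table standing in for `myCities`; `toy_cities`, `states_per_city` and `ambiguous` are hypothetical names, and the rows are illustrative rather than taken from US.txt.

```r
# Toy stand-in for the de-duplicated city/state/county table (illustrative rows).
toy_cities <- data.frame(
  city   = c("dallas", "dallas", "hollywood", "hollywood", "raleigh"),
  state  = c("TX", "GA", "CA", "FL", "NC"),
  county = c("Dallas", "Paulding", "Los Angeles", "Broward", "Wake"),
  stringsAsFactors = FALSE
)

# Count how many distinct states each city name appears in.
states_per_city <- tapply(toy_cities$state, toy_cities$city,
                          function(s) length(unique(s)))

# City names that occur in more than one state cannot be resolved by name alone.
ambiguous <- names(states_per_city)[states_per_city > 1]
ambiguous
```

Running the same count on the real `myCities` table would show which of your city names need a state to disambiguate them.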

myPlaces <- data.frame(city = myCityNames, state = c("NC", "CA", "TX", "CA", "OK", "FL"))
merge(myCities, myPlaces, by = c("city", "state"), all.y = TRUE)
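The behaviour of that merge can be checked on toy data. The sketch below assumes a small hand-built lookup in place of `myCities`; `toy_cities`, `my_places` and `result` are hypothetical names, the rows are illustrative, and the "springfield"/"ZZ" pair is deliberately absent from the lookup.

```r
# Toy lookup table (illustrative rows, not taken from US.txt).
toy_cities <- data.frame(
  city   = c("dallas", "dallas", "dallas", "hollywood", "hollywood"),
  state  = c("TX", "TX", "GA", "CA", "FL"),
  county = c("Dallas", "Collin", "Paulding", "Los Angeles", "Broward"),
  stringsAsFactors = FALSE
)

# The city/state pairs we actually care about; the last one has no match.
my_places <- data.frame(
  city  = c("dallas", "hollywood", "springfield"),
  state = c("TX", "CA", "ZZ"),
  stringsAsFactors = FALSE
)

# all.y = TRUE keeps every requested place, filling county with NA when
# the lookup has no matching city/state pair.
result <- merge(toy_cities, my_places, by = c("city", "state"), all.y = TRUE)
result
```

Two things are worth noting under these assumptions: same-named cities in other states drop out, and a city whose postal codes span more than one county still comes back as several rows, so a further rule (for example, keeping the county with the most postal codes) may be needed if exactly one county per city is required.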