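I am using the geocode function from the ggmap package to geocode country names, and then passing the results to distHaversine from the geosphere package to compute the distance between the two countries.
A sample of my data: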
    Country.Value Address.Country
 1: United States   United States
 2:        Cyprus   United States
 3:     Indonesia   United States
 4:      Tanzania        Tanzania
 5:    Madagascar   United States
 6:        Belize          Canada
 7:     Argentina       Argentina
 8:         Egypt           Egypt
 9:  South Africa    South Africa
10:      Paraguay        Paraguay
I also use an if-else chain to try to stay within the geocoding limit imposed by the free Google Maps geocoder. My code is as follows:
for(i in 1:nrow(df)) {
  row <- df.cont.long[i,]
  src_lon <- 0.0
  src_lat <- 0.0
  trgt_lon <- 0.0
  trgt_lat <- 0.0
  if((row$Country.Value=='United States')){ #Reduce geocoding requirements
    trgt_lon <- -95.7129
    trgt_lat <- 37.0902
  } else if((row$Address.Country=='United States')){ #Reduce Geocoding Requirements
    src_lon <- -95.7129
    src_lat <- 37.0902
  } else if((row$Country.Value=='Canada')){ #Reduce geocoding requirements
    trgt_lon <- -106.3468
    trgt_lat <- 56.1304
  } else if((row$Primary.Address.Country=='Canada')){ #Reduce Geocoding Requirements
    src_lon <- -106.3468
    src_lat <- 56.1304
  } else if(row$Country.Value == row$Address.Country){ #Reduce Geocoding Requirements
    # trgt<-geocode(row$Country.Value)
    # trgt_lon<-as.numeric(trgt$lon)
    # trgt_lat<-as.numeric(trgt$lat)
    # src_lon<-as.numeric(trgt$lon)
    # src_lat<-as.numeric(trgt$lat)
  } else {
    trgt <- geocode(row$Country.Value, output=c("latlon"))
    trgt_lon <- as.numeric(trgt$lon)
    trgt_lat <- as.numeric(trgt$lat)
    src <- geocode(row$Address.Country)
    src_lon <- as.numeric(src$lon)
    src_lat <- as.numeric(src$lat)
  }
  print(i)
  print(c(row$Address.Country, src_lon, src_lat))
  print(c(row$Country.Value, trgt_lon, trgt_lat))
  print(distHaversine(p1=c(as.numeric(src$lon), as.numeric(src$lat)),
                      p2=c(as.numeric(trgt$lon), as.numeric(trgt$lat))))
}
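I can't tell from the output where the code goes wrong.
In addition, uncommenting the lines in the branch where I check whether Country.Value and Address.Country are equal makes things worse.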
Answer 0 (score: 2)
The functions you are using are vectorized, so all you really need is
library(ggmap)
library(geosphere)
distHaversine(geocode(as.character(df$Country.Value)),
              geocode(as.character(df$Address.Country)))
# [1] 0 10432624 14978567 0 15868544 4588708 0 0 0 0
Note that as.character is there because ggmap::geocode doesn't like factors. The results make sense:
df$distance <- distHaversine(geocode(as.character(df$Country.Value), source = 'dsk'),
                             geocode(as.character(df$Address.Country), source = 'dsk'))
df
# Country.Value Address.Country distance
# 1 United States United States 0
# 2 Cyprus United States 10340427
# 3 Indonesia United States 14574480
# 4 Tanzania Tanzania 0
# 5 Madagascar United States 16085178
# 6 Belize Canada 5172279
# 7 Argentina Argentina 0
# 8 Egypt Egypt 0
# 9 South Africa South Africa 0
# 10 Paraguay Paraguay 0
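Because the sample contains many repeated country names, one way to stay under a geocoding quota is to geocode each distinct name once and then look the coordinates up per row. The following is a minimal sketch of that idea, not part of the original answer. The helper fake_geocode is a hypothetical offline stand-in for a real geocoder, and it only knows the two centers quoted in the question.

```r
# Hypothetical offline stand-in for a real geocoder such as ggmap::geocode.
# It only knows the two country centers quoted in the question.
fake_geocode <- function(names) {
  centers <- data.frame(
    name = c("United States", "Canada"),
    lon  = c(-95.7129, -106.3468),
    lat  = c(37.0902, 56.1304),
    stringsAsFactors = FALSE
  )
  centers[match(names, centers$name), c("lon", "lat")]
}

# A small illustrative data frame with factor columns, as in the question.
df <- data.frame(
  Country.Value   = c("United States", "United States", "Canada"),
  Address.Country = c("United States", "Canada", "Canada"),
  stringsAsFactors = TRUE
)

# Geocode every distinct country name exactly once.
countries <- unique(c(as.character(df$Country.Value),
                      as.character(df$Address.Country)))
lookup <- fake_geocode(countries)
rownames(lookup) <- countries

# Expand the lookup back to one row of coordinates per data row.
trgt <- lookup[as.character(df$Country.Value), ]
src  <- lookup[as.character(df$Address.Country), ]
```

With a real geocoder in place of fake_geocode (wrapping its result in as.data.frame if it returns a tibble is an assumption that may be needed), src and trgt could then be passed to distHaversine in one vectorized call. Here three rows cost only two lookups.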
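If you don't want to use ggmap::geocode, tmap::geocode_OSM is another geocoding function, one that uses OpenStreetMap data. However, because it is not vectorized, you need to iterate over it column by column: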
distHaversine(t(sapply(df$Country.Value, function(x){tmap::geocode_OSM(x)$coords})),
              t(sapply(df$Address.Country, function(x){tmap::geocode_OSM(x)$coords})))
# [1] 0 10448111 14794618 0 16110917 5156823 0 0 0 0
or row by row:
apply(df, 1, function(x){distHaversine(tmap::geocode_OSM(x['Country.Value'])$coords,
                                       tmap::geocode_OSM(x['Address.Country'])$coords)})
# [1] 0 10448111 14794618 0 16110917 5156823 0 0 0 0
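In both cases, $coords subsets the coordinates out of the geocoding result. Also note that Google, DSK, and OSM each pick a different center for every country, so the resulting distances will differ.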
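To see why the choice of center matters, it can help to look at the great-circle (haversine) formula itself. The sketch below reimplements it in base R under the assumption of a spherical Earth with radius 6378137 m, which is the default radius of distHaversine. The function name haversine_m is made up for this illustration, and the coordinates are the United States and Canada centers from the question.

```r
# Haversine great-circle distance in meters between (lon1, lat1) and
# (lon2, lat2), all given in degrees. Vectorized over its arguments.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * r * asin(pmin(1, sqrt(a)))
}

# Distance between the two country centers used in the question.
d <- haversine_m(-95.7129, 37.0902, -106.3468, 56.1304)

# Moving the United States center one degree north changes the result,
# which is the effect different geocoders have on the distances.
d_shifted <- haversine_m(-95.7129, 38.0902, -106.3468, 56.1304)

print(c(d, d_shifted, d - d_shifted))
```

A shift of a single degree in one center already moves the distance by tens of kilometers, so differences between geocoders of the size seen in the outputs above are to be expected.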