Finding the distance between locations in an R data frame

Time: 2016-11-01 16:14:04

Tags: r

I have a data frame with the following rows and columns, and I want to find the distance between county_a and county_b in each row.

  county_a county_b     lat_a      lon_a winner_a     lat_b      lon_b winner_b
1    01001    01001 32.536382 -86.644490      rep 32.536382 -86.644490      rep
2    01003    01001 30.659218 -87.746067      rep 32.536382 -86.644490      rep
3    01005    01001 31.870670 -85.405456      rep 32.536382 -86.644490      rep
4    01007    01001 33.015893 -87.127148      rep 32.536382 -86.644490      rep
5    01009    01001 33.977448 -86.567246      rep 32.536382 -86.644490      rep
6    01011    01001 32.101759 -85.717261      dem 32.536382 -86.644490      rep

I tried the following and got an error (both are shown below):

library(geosphere)
library(RJDBC) # Not sure this was used for this but it comes up earlier in the program
library(dplyr)
df%>%mutate(dist = distm(c(lon_a,lat_a), c(lon_b, lat_b), fun=distHaversine))

Error message: Error in eval(substitute(expr), envir, enclos) : Wrong length for a vector, should be 2

Thanks in advance for your help!

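2 answers:

Answer 0 (score: 1)

If you cannot work out how to use the ready-made functions from an R package, you can always define your own great-circle distance function: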

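# Great-circle distance via the spherical law of cosines.
# All four arguments must be in radians (degrees * pi / 180).
gcd.slc <- function(long1, lat1, long2, lat2) {
    R <- 6371 # Earth mean radius [km]
    d <- acos(sin(lat1)*sin(lat2) + cos(lat1)*cos(lat2) * cos(long2-long1)) * R
    return(d) # Distance in km
}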

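This function uses the spherical law of cosines to find the distance between two points from their latitudes and longitudes. The inputs must be in radians, so coordinates stored in degrees (as in the question) have to be multiplied by pi/180 first.

Reference: https://www.google.com.sg/amp/s/www.r-bloggers.com/great-circle-distance-calculations-in-r/amp/?client=ms-android-samsung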
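As a rough sketch of how this formula could be applied to degree-valued columns like those in the question, the snippet below wraps the same law-of-cosines formula in a helper that converts degrees to radians first. The name gcd_slc_deg and the clamping step are assumptions added for illustration; they are not part of the referenced article.

```r
# Hypothetical helper (not from the referenced article): same spherical
# law of cosines formula, but taking coordinates in degrees.
gcd_slc_deg <- function(long1, lat1, long2, lat2) {
  R <- 6371                    # Earth mean radius [km]
  to_rad <- pi / 180           # degrees -> radians
  long1 <- long1 * to_rad; lat1 <- lat1 * to_rad
  long2 <- long2 * to_rad; lat2 <- lat2 * to_rad
  x <- sin(lat1) * sin(lat2) + cos(lat1) * cos(lat2) * cos(long2 - long1)
  # Added safeguard: rounding can push x slightly outside [-1, 1]
  # for identical points, which would make acos() return NaN.
  x <- pmin(pmax(x, -1), 1)
  acos(x) * R                  # distance in km
}

# First two rows of the question's data, coordinates in degrees
df <- data.frame(
  lat_a = c(32.536382, 30.659218),
  lon_a = c(-86.644490, -87.746067),
  lat_b = c(32.536382, 32.536382),
  lon_b = c(-86.644490, -86.644490)
)

# The function is vectorised, so it works on whole columns at once
df$dist_km <- gcd_slc_deg(df$lon_a, df$lat_a, df$lon_b, df$lat_b)
print(df)
```

The first row should come out at (or extremely close to) zero and the second at roughly 233 km. The result is in kilometres because R is given in kilometres.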

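Answer 1 (score: 1)

You need to supply each argument as a matrix with two columns (longitude, latitude), so use cbind instead of c. Note that distHaversine, unlike distm, returns one distance per row rather than a full distance matrix.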
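The difference between c and cbind can be checked in base R without any packages. This is only a small illustration using three of the question's coordinates; the variable names are made up for the example.

```r
lon <- c(-86.644490, -87.746067, -85.405456)
lat <- c(32.536382, 30.659218, 31.870670)

v <- c(lon, lat)      # one flat vector: all longitudes, then all latitudes
m <- cbind(lon, lat)  # matrix with one (lon, lat) point per row

print(length(v))      # number of elements in the flat vector
print(dim(m))         # rows and columns of the matrix
```

Inside mutate, c(lon_a, lat_a) concatenates the whole columns in the same way, so the result is not a single point of length 2, which is what the error message complains about.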

df <- read.table(text="  county_a county_b     lat_a      lon_a winner_a     lat_b      lon_b winner_b
1    01001    01001 32.536382 -86.644490      rep 32.536382 -86.644490      rep
2    01003    01001 30.659218 -87.746067      rep 32.536382 -86.644490      rep
3    01005    01001 31.870670 -85.405456      rep 32.536382 -86.644490      rep
4    01007    01001 33.015893 -87.127148      rep 32.536382 -86.644490      rep
5    01009    01001 33.977448 -86.567246      rep 32.536382 -86.644490      rep
6    01011    01001 32.101759 -85.717261      dem 32.536382 -86.644490      rep")

library(dplyr)
library(geosphere)
df %>% mutate(dist = distHaversine(cbind(lon_a, lat_a), cbind(lon_b, lat_b)))

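This gives you the following, where dist is in meters because distHaversine defaults to an Earth radius of 6378137 m: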

  county_a county_b    lat_a     lon_a winner_a    lat_b     lon_b winner_b      dist
1     1001     1001 32.53638 -86.64449      rep 32.53638 -86.64449      rep      0.00
2     1003     1001 30.65922 -87.74607      rep 32.53638 -86.64449      rep 233609.47
3     1005     1001 31.87067 -85.40546      rep 32.53638 -86.64449      rep 138247.91
4     1007     1001 33.01589 -87.12715      rep 32.53638 -86.64449      rep  69929.04
5     1009     1001 33.97745 -86.56725      rep 32.53638 -86.64449      rep 160579.78
6     1011     1001 32.10176 -85.71726      dem 32.53638 -86.64449      rep  99747.13
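As a sanity check on the dist column, the haversine formula can also be written out in base R. This is a minimal sketch, assuming the same default radius of 6378137 m; haversine_m is a hypothetical name and not a geosphere function.

```r
# Hypothetical base-R haversine, for cross-checking only.
# Inputs are in degrees; the result is in the units of r (meters here).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))  # pmin guards against rounding above 1
}

# Row 2 of the table: county 01003 to county 01001
d <- haversine_m(-87.746067, 30.659218, -86.644490, 32.536382)
print(d)
```

The value should land close to the 233609.47 shown in row 2 above. The function is vectorised, so it could also be applied to whole columns.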