Question

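I have two datasets, each containing a set of postcodes together with their "Lat" and "Lon". I want to build a distance matrix between every postcode in one dataset and every postcode in the other.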

df.postcodes <- data.frame(name = c("21075", "20099", "33613"),
                           lat  = c(53.459940, 53.5580847, 52.0454598),
                           lon  = c(9.9288308, 10.0119789, 8.5196291))
df.postcodes1 <- data.frame(name = c("210751", "200991"),
                            lat  = c(55.459940, 52.5580847),
                            lon  = c(10.9288308, 11.0119789))

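This is a sample dataset. Basically, I want a distance matrix from every postcode in df.postcodes1 to every postcode in df.postcodes, and then to return the nearest postcode. I have heard of the Imap package, but I could not build a matrix with it.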

Answer 1

Basically, I use gdist from the Imap package to compute the geographic distance between two points.

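To get the distance matrix between all points of set A and all points of set B, you can use outer (or expand.grid, but outer is the better fit because you want the result as a matrix). outer generates all index pairs for you (the Cartesian product of the two sets).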

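Finally, you need to vectorize gdist, because outer expects a vectorized function: it calls the function once with the full expanded index vectors rather than once per pair. I do this with mapply (you could also use Vectorize).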
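For the Vectorize alternative mentioned above, a minimal sketch could look like the following. To keep it runnable without extra packages it uses a hypothetical scalar helper, haversine_km (a spherical approximation in kilometres, not part of Imap), and a hypothetical wrapper, dist_matrix. Both names are assumptions made for illustration.

```r
# Hypothetical stand-in for a scalar distance function (haversine, in km).
# The `if` makes it scalar-only, which is why it has to be vectorized.
haversine_km <- function(lon1, lat1, lon2, lat2, radius = 6371) {
  if (lon1 == lon2 && lat1 == lat2) return(0)
  rad <- pi / 180
  a <- sin((lat2 - lat1) * rad / 2)^2 +
    cos(lat1 * rad) * cos(lat2 * rad) * sin((lon2 - lon1) * rad / 2)^2
  2 * radius * asin(sqrt(a))
}

df.postcodes <- data.frame(name = c("21075", "20099", "33613"),
                           lat  = c(53.459940, 53.5580847, 52.0454598),
                           lon  = c(9.9288308, 10.0119789, 8.5196291))
df.postcodes1 <- data.frame(name = c("210751", "200991"),
                            lat  = c(55.459940, 52.5580847),
                            lon  = c(10.9288308, 11.0119789))

# Hypothetical wrapper: rows come from `a`, columns from `b`.
dist_matrix <- function(a, b, dist_fun) {
  # Distance between row i of `a` and row j of `b`, for scalar i and j.
  pair_dist <- function(i, j) dist_fun(a$lon[i], a$lat[i], b$lon[j], b$lat[j])
  # Vectorize() lets outer() pass whole index vectors to pair_dist.
  res <- outer(seq_len(nrow(a)), seq_len(nrow(b)), Vectorize(pair_dist))
  dimnames(res) <- list(as.character(a$name), as.character(b$name))
  res
}

res_km <- dist_matrix(df.postcodes, df.postcodes1, haversine_km)
```

With Imap loaded, gdist should be usable as dist_fun in the same way, since it also takes its coordinates in the order lon, lat, lon, lat. The values will then differ slightly, because gdist works on an ellipsoid and in its own default unit.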

library(Imap)

## a vectorized version of `gdist`
## x and y are vectors of row indices
dist_imap <- 
function(x,y){
  p1 <- df.postcodes[x,]
  p2 <- df.postcodes1[y,]
  mapply(gdist,p1$lon,p1$lat,p2$lon,p2$lat)
}
## Use index of rows since we have to loop over data.frames
X <- seq_len(nrow(df.postcodes))
Y <- seq_len(nrow(df.postcodes1))
## outer will generate all combinations of indices
## and apply the vectorized function created above.
res <- outer(X,Y,dist_imap)
## naming for pretty output
rownames(res)  <- df.postcodes$name
colnames(res) <- df.postcodes1$name

## distances are in nautical miles, the default unit of gdist
#         210751   200991
# 21075 125.2018 66.91572
# 20099 118.7207 70.15158
# 33613 222.3866 96.82441
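The question also asks for the nearest postcode, which the matrix does not give directly. A minimal sketch, assuming res is laid out as above (rows from df.postcodes, columns from df.postcodes1), is to take the row with the smallest value in each column. The matrix is rebuilt here from the printed output so the snippet runs on its own, and the name nearest is only illustrative.

```r
# Distance matrix copied from the output above
# (rows: df.postcodes, columns: df.postcodes1).
res <- matrix(c(125.2018, 118.7207, 222.3866,
                66.91572, 70.15158, 96.82441),
              nrow = 3,
              dimnames = list(c("21075", "20099", "33613"),
                              c("210751", "200991")))

# For each column, the index of the row with the smallest distance.
nearest_idx <- apply(res, 2, which.min)

# One row per postcode in df.postcodes1, with its closest match.
nearest <- data.frame(postcode = colnames(res),
                      nearest  = rownames(res)[nearest_idx],
                      distance = unname(apply(res, 2, min)),
                      stringsAsFactors = FALSE)
```

To search in the other direction (the nearest df.postcodes1 entry for each row of df.postcodes), apply the same functions over rows with apply(res, 1, which.min).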

Postcode distance using Imap
