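I have a dataset in which I am trying to map group locations to the nearest cities. Dataset 1 (df1) contains address locations as longitude and latitude. I want to map each address to all of the cities (in data frame df2) that lie within a 50-mile radius.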
g_lat <- c(45.52306, 40.26719, 34.05223, 37.38605, 37.77493)
g_lon <- c(-122.67648,-86.13490, -118.24368, -122.08385, -122.41942)
address <- c(1,2,3,4,5)
df1 <- data.frame(g_lat, g_lon, address)
g_lat <- c(+37.7737185, +45.5222208,+37.77493)
g_lon <- c(-122.2744317,-098.7041549,-122.41942)
msa <- c(1,2,3)
df2 <- data.frame(g_lat, g_lon, msa)
I would like output like the following, showing every msa associated with each address:
address    g_lat      g_lon msa
      5 37.77493 -122.41942   1
      5 37.77493 -122.41942   3
Please let me know how to achieve this. I tried the following:
library(geosphere)
# create distance matrix
mat <- distm(df1[,c('g_lon','g_lat')], df2[,c('g_lon','g_lat')], fun=distVincentyEllipsoid)
error:
Error in .pointsToMatrix(y) : longitude < -360
# assign the name to the point in list1 based on shortest distance in the matrix
df1$locality <- df2$locality[max.col(-mat)]
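The message `longitude < -360` is raised when a longitude passed to geosphere is below -360. The sample values above are all within range, so the real data probably contains an out-of-range or mis-parsed coordinate. A minimal sketch of a sanity check follows; `check_coords` is a hypothetical helper, and the `demo` data frame is made up for illustration (it is not from the question).

```r
# Hypothetical helper: return the rows whose coordinates are not numeric
# or fall outside the valid longitude/latitude ranges.
check_coords <- function(df, lon = "g_lon", lat = "g_lat") {
  # as.character() first so that factor columns are parsed by label, not by level code
  lon_v <- suppressWarnings(as.numeric(as.character(df[[lon]])))
  lat_v <- suppressWarnings(as.numeric(as.character(df[[lat]])))
  bad <- is.na(lon_v) | is.na(lat_v) | abs(lon_v) > 180 | abs(lat_v) > 90
  df[bad, , drop = FALSE]
}

# Illustrative data only: the second longitude has a misplaced decimal point.
demo <- data.frame(g_lat = c(37.77493, 45.52222),
                   g_lon = c(-122.41942, -987.041549))
bad_rows <- check_coords(demo)
print(bad_rows)
```

Running the same check on both data frames before calling `distm` would show which rows, if any, need cleaning.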
Answer 0 (score: 1)
A possible solution:
library(geosphere)
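# distance matrix in meters: rows are df1 addresses, columns are df2 cities
mat <- distm(df1[,c('g_lon','g_lat')], df2[,c('g_lon','g_lat')], fun=distVincentyEllipsoid)
# row and column indices of every pair closer than 80000 m (roughly 50 miles)
ri <- row(mat)[mat < 80000]
ci <- col(mat)[mat < 80000]
# repeat each matching address row and attach the msa of the matching city
df3 <- df1[ri,]
df3$msa <- df2[ci, "msa"]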
which gives:
> df3
       g_lat     g_lon address msa
4   37.38605 -122.0838       4   1
5   37.77493 -122.4194       5   1
4.1 37.38605 -122.0838       4   3
5.1 37.77493 -122.4194       5   3
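The distances are in meters, and the 80000 cutoff is slightly under 50 miles (50 × 1609.344 = 80467.2 m). The sketch below uses the exact radius and collects the matching pairs with `which(..., arr.ind = TRUE)`. So that it runs without geosphere, it swaps in a base-R haversine function (`haversine_m`, a hypothetical helper assuming a spherical Earth of radius 6371 km) in place of `distVincentyEllipsoid`; the two can differ by a fraction of a percent.

```r
df1 <- data.frame(
  g_lat   = c(45.52306, 40.26719, 34.05223, 37.38605, 37.77493),
  g_lon   = c(-122.67648, -86.13490, -118.24368, -122.08385, -122.41942),
  address = c(1, 2, 3, 4, 5)
)
df2 <- data.frame(
  g_lat = c(37.7737185, 45.5222208, 37.77493),
  g_lon = c(-122.2744317, -98.7041549, -122.41942),
  msa   = c(1, 2, 3)
)

# Hypothetical helper: great-circle distance in meters on a sphere (an approximation).
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# 5 x 3 matrix: rows are df1 addresses, columns are df2 cities
mat <- outer(seq_len(nrow(df1)), seq_len(nrow(df2)), function(i, j) {
  haversine_m(df1$g_lon[i], df1$g_lat[i], df2$g_lon[j], df2$g_lat[j])
})

radius_m <- 50 * 1609.344                       # 50 miles in meters
idx <- which(mat < radius_m, arr.ind = TRUE)    # columns "row" and "col"

df3 <- df1[idx[, "row"], ]
df3$msa <- df2$msa[idx[, "col"]]
print(df3)
```

With geosphere installed, the same cutoff could be applied to the matrix from `distm` as `mat < radius_m`.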
Using data.table or dplyr:
library(data.table)
setDT(df1)[ri][, msa := df2[ci, "msa"]][]
library(dplyr)
df1 %>%
  slice(ri) %>%
  mutate(msa = df2[ci, "msa"])
You can add the distance (in meters) with:
df3$dist <- mat[cbind(ri, ci)]
which gives:
> df3
       g_lat     g_lon address msa     dist
4   37.38605 -122.0838       4   1 46202.74
5   37.77493 -122.4194       5   1 12774.31
4.1 37.38605 -122.0838       4   3 52359.08
5.1 37.77493 -122.4194       5   3     0.00
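Since the question states the radius in miles, it may help to report the distance in miles too and to sort each address's matches from nearest to farthest. A small sketch follows; `df3` is rebuilt by hand from the values printed above so the snippet runs on its own, and the `dist_miles` column name is an assumption.

```r
# df3 re-created from the output shown above (values copied, not recomputed)
df3 <- data.frame(
  g_lat   = c(37.38605, 37.77493, 37.38605, 37.77493),
  g_lon   = c(-122.08385, -122.41942, -122.08385, -122.41942),
  address = c(4, 5, 4, 5),
  msa     = c(1, 1, 3, 3),
  dist    = c(46202.74, 12774.31, 52359.08, 0)
)

# 1 international mile is exactly 1609.344 m
df3$dist_miles <- df3$dist / 1609.344

# order by address, then by increasing distance within each address
df3 <- df3[order(df3$address, df3$dist), ]
print(df3)
```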