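I am trying to merge two data frames (main and sub). I would like to merge the "Variable" data from "sub" into "main" based on distance or, better, by whichever "sub" row/site is nearest to each "main" row/site.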
library(sf)
a <- structure(list(`Site#` = c("Site1", "Site2", "Site3", "Site4", "Site5", "Site6"), Longitude = c(-94.609, -98.1391, -99.033, -98.49, -96.4309, -95.99), `Latitude` = c(38.922, 37.486111, 37.811, 38.364, 39.4402, 39.901)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
main <- st_as_sf(a, coords = c("Longitude", "Latitude"), crs = 4326)
b <- structure(list(Longitude = c(-98.49567, -96.22451, -98.49567, -98.941391, -95.91411, -99.031113), `Latitude` = c(38.31264,39.97692, 38.31264, 37.486111, 39.92143, 37.814171), Variable = c(400, 50, 100, 201, 99, 700)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
sub <- st_as_sf(b, coords = c("Longitude", "Latitude"), crs = 4326)
c <- st_intersection(main,sub)
c <- st_is_within_distance(main,sub,dist=0.001)
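I believe st_intersection is what I want, but if I could do it one-to-one based on distance, that would make it work. Does anyone know of something that would give me the result I want?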
Answer 0 (score: 5)
st_join() with join = st_nearest_feature does the join in a single step:
st_join(main, sub, join = st_nearest_feature, left = TRUE)
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#> Simple feature collection with 6 features and 2 fields
#> geometry type: POINT
#> dimension: XY
#> bbox: xmin: -99.033 ymin: 37.48611 xmax: -94.609 ymax: 39.901
#> epsg (SRID): 4326
#> proj4string: +proj=longlat +datum=WGS84 +no_defs
#> # A tibble: 6 x 3
#> `Site#` geometry Variable
#> <chr> <POINT [°]> <dbl>
#> 1 Site1 (-94.609 38.922) 99
#> 2 Site2 (-98.1391 37.48611) 201
#> 3 Site3 (-99.033 37.811) 700
#> 4 Site4 (-98.49 38.364) 400
#> 5 Site5 (-96.4309 39.4402) 50
#> 6 Site6 (-95.99 39.901) 99
Created on 2020-01-19 by the reprex package (v0.3.0)
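Since the question asks for a join "based on distance", it may help to keep the distance to the matched feature next to the joined value. The following is a minimal sketch, not part of the original answer: it assumes a small subset of the question's coordinates, and the names main_pts, sub_pts, and dist_m are made up for illustration. It uses st_nearest_feature() to get the index of the nearest "sub" point and st_distance() with by_element = TRUE to measure each matched pair.

```r
library(sf)

# Illustrative subset of the question's data (names are hypothetical)
main_pts <- st_as_sf(
  data.frame(site = c("Site4", "Site6"),
             lon  = c(-98.49, -95.99),
             lat  = c(38.364, 39.901)),
  coords = c("lon", "lat"), crs = 4326
)
sub_pts <- st_as_sf(
  data.frame(Variable = c(400, 50, 99),
             lon = c(-98.49567, -96.22451, -95.91411),
             lat = c(38.31264, 39.97692, 39.92143)),
  coords = c("lon", "lat"), crs = 4326
)

# Row index of the nearest sub point for each main point
nearest <- st_nearest_feature(main_pts, sub_pts)

# Copy the attribute over and record the distance to the match;
# for longitude/latitude data st_distance() reports metres
main_pts$Variable <- sub_pts$Variable[nearest]
main_pts$dist_m   <- st_distance(main_pts, sub_pts[nearest, ], by_element = TRUE)

print(main_pts)
```

Keeping dist_m makes it easy to inspect the matches afterwards and to spot sites whose nearest neighbour is implausibly far away.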
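Answer 1 (score: 3)
Here is what I tried. It seems you need st_nearest_feature(), which returns the index of the nearest feature. Once you have the indices, you can add them to main. You can also add row numbers (an index) to b. Then all that is left is the join.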
library(dplyr)
library(sf)
# Which feature in y is closest to each feature in x?
# You get row index
st_nearest_feature(x = main, y = sub)
# Add the index number to main.
mutate(main, ind = st_nearest_feature(x = main, y = sub)) -> main
# Add row numbers (index) to b
mutate(b, ind = 1:n()) -> b
left_join(main, b, by = "ind")
# `Site#` geometry ind Longitude Latitude Variable
# <chr> <POINT [°]> <int> <dbl> <dbl> <dbl>
#1 Site1 (-94.609 38.922) 5 -95.9 39.9 99
#2 Site2 (-98.1391 37.48611) 4 -98.9 37.5 201
#3 Site3 (-99.033 37.811) 6 -99.0 37.8 700
#4 Site4 (-98.49 38.364) 1 -98.5 38.3 400
#5 Site5 (-96.4309 39.4402) 2 -96.2 40.0 50
#6 Site6 (-95.99 39.901) 5 -95.9 39.9 99
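A nearest-feature join always returns a match, however far away it is. If matches beyond some cutoff should be dropped, the index approach above can be extended as in the following sketch. This is an assumption-laden illustration rather than part of the original answer: join_nearest_within() and its max_dist_m argument are hypothetical names, the cutoff of 10 km is arbitrary, and the distances are assumed to be in metres (which is what st_distance() reports for longitude/latitude data).

```r
library(sf)

# Hypothetical helper: nearest-feature join that blanks out matches
# farther away than max_dist_m (assumed to be in metres)
join_nearest_within <- function(x, y, max_dist_m) {
  idx <- st_nearest_feature(x, y)
  d   <- as.numeric(st_distance(x, y[idx, ], by_element = TRUE))

  # Attributes of the matched rows, without the geometry column
  matched <- st_drop_geometry(y)[idx, , drop = FALSE]
  matched[d > max_dist_m, ] <- NA

  # Attach the attributes (overwrites same-named columns in x)
  for (nm in names(matched)) x[[nm]] <- matched[[nm]]
  x$dist_m <- d
  x
}

# Illustrative subset of the question's data
main_pts <- st_as_sf(
  data.frame(site = c("Site4", "Site6", "Site1"),
             lon  = c(-98.49, -95.99, -94.609),
             lat  = c(38.364, 39.901, 38.922)),
  coords = c("lon", "lat"), crs = 4326
)
sub_pts <- st_as_sf(
  data.frame(Variable = c(400, 50, 99),
             lon = c(-98.49567, -96.22451, -95.91411),
             lat = c(38.31264, 39.97692, 39.92143)),
  coords = c("lon", "lat"), crs = 4326
)

res <- join_nearest_within(main_pts, sub_pts, max_dist_m = 10000)
print(res)
```

With this subset, Site1 has no "sub" point within 10 km, so its Variable becomes NA while the other two sites keep their nearest match.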