I have two data frames imported from txt files: sample points and station locations.
Sample points data frame:
X Y Z
346449.30 576369.65 86.93
346449.55 576368.24 87.16
346449.29 576368.17 79.08
346449.83 576366.86 88.23
346449.97 576365.42 84.97
346449.91 576362.22 86.59
346449.74 576363.65 88.87
346449.61 576363.59 84.99
346449.50 576363.54 81.33
Station locations data frame:
Station x y
1 346479.720 576349.710
2 346575.380 576361.530
3 346685.540 576303.180
4 346722.820 576412.680
5 346514.780 576406.140
6 346813.130 576435.830
7 346748.880 576304.090
8 346825.830 576402.800
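So I would like to know how to find and label the points (from the sample points data frame) that fall inside a buffer around a station (for example, a 3-meter buffer radius around each station in the second data frame).
This is what I would like to get: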
X Y Z Station
346449.30 576369.65 86.93 1
346449.55 576368.24 87.16 1
346449.29 576368.17 79.08 1
346449.83 576366.86 88.23 2
346449.97 576365.42 84.97 2
346449.91 576362.22 86.59 3
346449.74 576363.65 88.87 4
346449.61 576363.59 84.99 5
346449.50 576363.54 81.33 5
346449.51 576365.07 89.38 5
346449.36 576365.01 84.93 5
346449.24 576366.46 88.70 5
346448.93 576367.83 86.75 5
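I am new to R, so any help is appreciated. Thanks.
Answer 0 (score: 1)
If you just want to add to your sample data frame the ID of the nearest station within 3 meters of each sample point, this would be one solution: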
# get a matrix with the squares of the Euclidean distances
# (rows = sample points, columns = stations)
mx <- outer(seq_len(nrow(sampleData)),
            seq_len(nrow(stations)),
            # return the square of the Euclidean distance
            function(i, j){
                (sampleData[i,'X'] - stations[j,'x'])^2 +
                (sampleData[i,'Y'] - stations[j,'y'])^2
            })
# maximum distance to consider
d <- 3
# get rid of distances greater than 3 meters away
mx[mx > d^2] <- NA
index <- apply(mx,
               1,
               # returns the row number in `stations` of the nearest station that is less than 3 meters away
               function(x){
                   if(all(is.na(x)))
                       return(NA)
                   # which.min ignores the NA entries
                   which.min(x)
               })
sampleData$station <- stations$Station[index]
# a comma-delimited list of stations with distance <= 3
sampleData$closeStations <- apply(mx,
                                  1,
                                  # returns the IDs of all stations within 3 meters as a single string
                                  function(x){
                                      if(all(is.na(x)))
                                          return(NA_character_)
                                      paste(stations$Station[!is.na(x)], collapse = ',')
                                  })
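As a quick way to check the logic without the original text files, here is a minimal, self-contained sketch of the same idea. The coordinates are made up for illustration (they are not from the question), and the squared-distance matrix is built with two `outer` calls on the coordinate vectors rather than on row indices, so treat it as a variation on the code above, not the same code.

```r
# Toy data: hypothetical values chosen for illustration only
sampleData <- data.frame(X = c(0, 1, 10, 50), Y = c(0, 1, 10, 50))
stations   <- data.frame(Station = 1:2, x = c(0, 11), y = c(0, 10))

# buffer radius
d <- 3

# squared distances: rows = sample points, columns = stations
mx <- outer(sampleData$X, stations$x, "-")^2 +
      outer(sampleData$Y, stations$y, "-")^2

# blank out every station that is farther away than the buffer radius
mx[mx > d^2] <- NA

# for each point, the column of the nearest remaining station (NA if none)
nearest <- apply(mx, 1, function(x) {
    if (all(is.na(x))) NA_integer_ else which.min(x)
})

sampleData$Station <- stations$Station[nearest]
print(sampleData)
```

Points with no station inside the buffer end up with `NA` in the `Station` column, so they are easy to filter out afterwards with `!is.na(sampleData$Station)`.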
Using `outer` and `apply` may make the solution run faster, but if you run into problems, a `for` loop may be easier to debug:
# maximum distance to consider
d <- 3
n <- nrow(sampleData)
# pre-fill with NA so that skipped data points do not leave the vectors short
distanceToNearestStation <- rep(NA_real_, n)
nearestStation <- rep(NA_integer_, n)
nearestStations <- rep(NA_character_, n)
for(i in seq_len(n)){
    # Euclidean distances from this data point to the stations
    distances <- sqrt((sampleData[i,'X'] - stations[,'x'])^2 +
                      (sampleData[i,'Y'] - stations[,'y'])^2 )
    # get rid of distances greater than 3 meters away
    # distances[distances>d] <- NA
    # all the stations are too far away or something is wrong with this data point
    if(all(is.na(distances)))
        next
    # record the nearest station to this data point
    distanceToNearestStation[i] <- min(distances, na.rm = TRUE)
    nearestStation[i] <- which.min(distances)
    # comma-delimited list of stations within 3 meters
    distanceIsClose <- distances <= d
    distanceIsClose[is.na(distanceIsClose)] <- FALSE
    nearestStations[i] <- paste(stations$Station[distanceIsClose], collapse = ',')
}
range(distanceToNearestStation, na.rm = TRUE)
sampleData$station <- stations$Station[nearestStation]
# number of data points within 3 meters of a station
table(distanceToNearestStation <= 3)
# data points within 3 meters of a station
# (which() drops NA entries; the name `nearby` avoids masking base::subset)
nearby <- sampleData[which(distanceToNearestStation <= 3),]
# save to individual files.
for(s in unique(nearby$station))
    write.csv(nearby[nearby$station == s,],
              file.path('My/Favorite/Directory' # note there is no trailing slash
                        ,paste('station',s,'data.csv')))
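If buffers can overlap and you would rather have one row per point-station pair than a comma-delimited string, one possible sketch uses `which(..., arr.ind = TRUE)` on the squared-distance matrix. The toy coordinates and the names `dist2`, `hits`, and `pairs` are made up for illustration; this is an alternative layout of the result, not part of the code above.

```r
# Toy data: hypothetical values chosen for illustration only
sampleData <- data.frame(X = c(0, 1, 10, 50), Y = c(0, 1, 10, 50))
stations   <- data.frame(Station = c(1, 2, 3), x = c(0, 11, 2), y = c(0, 10, 1))

# buffer radius
d <- 3

# squared distances: rows = sample points, columns = stations
dist2 <- outer(sampleData$X, stations$x, "-")^2 +
         outer(sampleData$Y, stations$y, "-")^2

# (row, col) index pairs of every point that lies inside a station's buffer
hits <- which(dist2 <= d^2, arr.ind = TRUE)

# one output row per point-station pair, with the actual distance
pairs <- data.frame(sampleData[hits[, "row"], , drop = FALSE],
                    Station  = stations$Station[hits[, "col"]],
                    Distance = sqrt(dist2[hits]),
                    row.names = NULL)
print(pairs)
```

A point that sits inside two buffers appears twice in `pairs`, once per station, and a point outside every buffer does not appear at all.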