My dataset contains the positions of objects:
library(dplyr)    # for %>% and mutate()
library(ggplot2)  # for ggplot()
so <- data.frame(x = rep(1:5, each = 5), y = rep(1:5, 5))
so1 <- so %>% mutate(x = x + 5, y = y + 2)
so2 <- rbind(so, so1) %>% mutate(x = x + 13, y = y + 7)
so3 <- so2 %>% mutate(x = x + 10)
ggplot(aes(x = x, y = y), data = rbind(so, so1, so2, so3)) + geom_point()
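What I would like to know is whether there is a method in R that can detect the objects located on the outer rows of the dataset, because I have to exclude those objects from the analysis. I want to exclude the red objects in the picture.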
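So far I have used min(), max() and ifelse(), but that approach is tedious, and I have not been able to create something that generalizes to different datasets with different x and y designs.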
Is there a package that does this? And/or is it possible to solve a problem like this at all?
Answer 0 (score: 4)
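Could you use a "spatial" approach? If you picture your data as a spatial object, your problem becomes one of removing the borders of the patches...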
With the raster package this can be done quite directly: find the boundaries() and mask() the data accordingly.
library(dplyr)
library(raster)
# Your reproducible example
myDF = rbind(so,so1,so2,so3)
myDF$z = 1 # there may actually be more 'z' variables
# Rasterize your data
r = rasterFromXYZ(myDF) # if there are more vars, this will be a RasterBrick
par(mfrow=c(2,2))
plot(r, main='Original data')
# Here I artificially add one row above and below and one column left and right.
# This trick is needed so that `boundaries` in the next step also flags the
# cells located at the very edge of your raster.
newextent = extent(r) + c(-res(r)[1], res(r)[1], -res(r)[2], res(r)[2] )
r = extend(r, newextent)
plot(r, main='Artificially extended')
plot(rasterToPoints(r, spatial=TRUE), add=TRUE, col='blue', pch=20, cex=0.3)
# Get the cells to remove, i.e. the boundaries
bounds = boundaries(r[[1]], asNA=TRUE) # [[1]]: in case r is a RasterBrick
plot(bounds, main='Cells to remove (where 1)')
plot(rasterToPoints(bounds, spatial=TRUE), add=TRUE, col='red', pch=20, cex=0.3)
# Then mask your data (i.e. subset to remove boundaries)
subr = mask(r, bounds, maskvalue=1)
plot(subr, main='Resulting data')
plot(rasterToPoints(subr, spatial=TRUE), add=TRUE, col='blue', pch=20, cex=0.3)
# This is your new data, returned as a matrix (rasterToPoints drops NA cells,
# so the artificially added border is not included)
myDF2 = rasterToPoints(subr)
Does that help?
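If installing raster is not an option, the same idea (a point is on the border when at least one of its 8 neighbouring grid positions is empty) can be sketched in base R. This is only a rough illustration, not part of the answer above: drop_border and its step argument are hypothetical names, and the sketch assumes that the points sit on a regular grid with equal spacing in x and y and that the coordinates compare exactly (for example integers).

```r
# Hypothetical helper: keep only the points whose 8 grid neighbours are all
# present in the data. Assumes a regular grid with spacing `step` in x and y.
drop_border <- function(df, step = 1) {
  # Encode each (x, y) position as a string so positions can be looked up
  key <- function(x, y) paste(x, y, sep = "_")
  present <- key(df$x, df$y)

  # Offsets to the 8 surrounding cells (the centre cell itself is excluded)
  offsets <- expand.grid(dx = c(-1, 0, 1) * step, dy = c(-1, 0, 1) * step)
  offsets <- offsets[!(offsets$dx == 0 & offsets$dy == 0), ]

  # A point is interior only if every one of its neighbours exists
  interior <- rep(TRUE, nrow(df))
  for (i in seq_len(nrow(offsets))) {
    neighbour <- key(df$x + offsets$dx[i], df$y + offsets$dy[i])
    interior <- interior & neighbour %in% present
  }
  df[interior, , drop = FALSE]
}

# Usage on the first 5 x 5 block from the question
so <- data.frame(x = rep(1:5, each = 5), y = rep(1:5, 5))
inner <- drop_border(so)
nrow(inner)  # 9: the central 3 x 3 block remains
```

Because the lookup relies on exact string matches of the coordinates, real-valued positions would probably need rounding or snapping to the grid first; the raster approach handles that through the cell resolution.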