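I have two datasets of xy coordinates. The first has the xy coordinates plus a tag column holding my factor levels. I call this data.frame `qq`, and it looks like this: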
structure(list(x = c(5109, 5128, 5137, 5185, 5258, 5324, 5387,
5343, 5331, 5347, 5300, 5180, 4109, 4082, 4091, 4139, 4212, 4279,
4291, 4297, 4285, 4301, 4254, 4181), y = c(1692, 1881, 2070,
2119, 2144, 2065, 1987, 1813, 1705, 1649, 1631, 1654, 1847, 2015,
2204, 2253, 2278, 2282, 2166, 1947, 1839, 1783, 1765, 1783),
tag = c("MPN_right", "MPN_right", "MPN_right", "MPN_right",
"MPN_right", "MPN_right", "MPN_right", "MPN_right", "MPN_right",
"MPN_right", "MPN_right", "MPN_right", "MPN_left", "MPN_left",
"MPN_left", "MPN_left", "MPN_left", "MPN_left", "MPN_left",
"MPN_left", "MPN_left", "MPN_left", "MPN_left", "MPN_left"
)), .Names = c("x", "y", "tag"), row.names = c(NA, -24L), class = "data.frame")
For the second dataset I generated random data, using the means of the `qq` `x` and `y` coordinates and a large `sd`.
set.seed(123)
my_points=data.frame(x=rnorm(n =1000,mean=mean(qq$x),sd=1000),
y=rnorm(n=1000,mean=mean(qq$y),sd=1000))
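If I use the `in.out` function from the `mgcv` package, I get something close to what I want.
The main problem with this approach is that my "polygon" is not closed, and it is not interpreted as 2 polygons according to the factor. The package suggests separating polygons with an NA row, but I would rather use my tag column, because I will try to use more than 2 levels in my tag factor, i.e. more than 2 polygons. My final goal is to make a table with the number of points inside each polygon.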
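For reference, the NA-row layout that the package suggests can be built from a tag column. The sketch below uses hypothetical toy data (two squares tagged `a` and `b`, not the `qq` data above) and shows why it does not meet the goal on its own: `in.out` returns a single logical vector for the combined boundary, so the information about which polygon contains a point is lost.

```r
library(mgcv)  # provides in.out()

# Hypothetical toy data: two squares, one per tag level
polys <- data.frame(
  x   = c(0, 2, 2, 0, 5, 7, 7, 5),
  y   = c(0, 0, 2, 2, 0, 0, 2, 2),
  tag = rep(c("a", "b"), each = 4)
)

# Split the vertices by tag and append an NA row after each polygon
pieces <- lapply(split(polys[, c("x", "y")], polys$tag),
                 function(p) rbind(as.matrix(p), c(NA, NA)))
bnd <- do.call(rbind, pieces)
bnd <- bnd[-nrow(bnd), ]  # drop the trailing NA row

# Three test points: inside "a", inside "b", and between the two squares
pts <- cbind(x = c(1, 6, 3.5), y = c(1, 1, 1))

# One TRUE/FALSE per point for the whole boundary, not per polygon
inside_any <- in.out(bnd, pts)
inside_any
```

Getting per-polygon counts therefore needs one call per tag level, or a spatial object that keeps the polygons separate.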
Answer 0 (score: 2)
How about:
library(sp)  # SpatialPoints, Polygon, Polygons, SpatialPolygons, over

mysppoint <- SpatialPoints(coords = my_points)  # create spatial points
qq$tag <- as.factor(qq$tag)
polys <- list()
# create one polygon for each factor level
for (lev in levels(qq$tag)) {
  first_x <- qq$x[qq$tag == lev][1]
  first_y <- qq$y[qq$tag == lev][1]
  # "close" the polygon by replicating its first row
  qq <- rbind(qq, data.frame(x = first_x, y = first_y, tag = lev))
  # transform to polygon
  polys[[lev]] <- Polygons(list(Polygon(matrix(data = cbind(qq$x[qq$tag == lev],
                                                            qq$y[qq$tag == lev]),
                                               ncol = 2))), lev)
}
mypolys <- SpatialPolygons(polys)  # convert to spatial polygons
# intersect points with polygons
inters <- factor(over(mysppoint, mypolys), labels = names(mypolys))
table(inters)
which gives:
inters
 MPN_left MPN_right
       10        17
The advantage of this approach is that it gives you proper spatial objects. For example:
library(ggplot2)
plotd <- fortify(mypolys)  # convert the polygons to a data frame for plotting
p <- ggplot()
p <- p + geom_point(data = my_points, aes(x = x, y = y), size = 0.2)
p <- p + geom_polygon(data = plotd, aes(x = long, y = lat, fill = id), alpha = 0.7)
p
Answer 1 (score: 1)
`lapply()` and `sapply()` let you apply the function to every factor level.
## a bit edited to make output clear
library(dplyr); library(mgcv)
TAG <- unique(qq$tag)
IN.OUT <- lapply(TAG, function(x) as.matrix(qq[qq$tag == x, 1:2])) %>%  # make a matrix per level
  sapply(function(x) in.out(x, as.matrix(my_points)))                   # use in.out() with each matrix
colnames(IN.OUT) <- TAG
head(IN.OUT, n = 3)
#      MPN_right MPN_left
# [1,]     FALSE    FALSE
# [2,]     FALSE    FALSE
# [3,]     FALSE    FALSE
apply(IN.OUT, 2, table)
#       MPN_right MPN_left
# FALSE       983      990
# TRUE         17       10
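If only the counts are needed, a logical matrix shaped like `IN.OUT` can be collapsed directly. This is a small sketch on a hypothetical 4-row stand-in matrix (the values are made up for illustration, not taken from the data above), and it assumes the polygons do not overlap when assigning one label per point.

```r
# Hypothetical stand-in for IN.OUT: one row per point, one column per polygon
IN.OUT <- cbind(MPN_right = c(TRUE, FALSE, FALSE, TRUE),
                MPN_left  = c(FALSE, TRUE, FALSE, FALSE))

# Number of points inside each polygon, as a named vector
counts <- colSums(IN.OUT)

# One label per point: the first polygon containing it, or NA if it is outside all of them
label <- apply(IN.OUT, 1, function(r) {
  if (any(r)) colnames(IN.OUT)[which(r)[1]] else NA_character_
})

counts
table(label, useNA = "ifany")
```

`colSums()` works here because `TRUE` counts as 1, so each column sum is the number of points inside that polygon.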
Answer 2 (score: 1)
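I ended up using a combination of `lapply` and some more `split`. Here is the code; please ignore the `extract_coords` helper function, which basically returns a `your_coords` data frame with `x`, `y`, and a tag column. I also managed to subset the points in the original data and count them (returning the counts as a vector rather than a table).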
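A minimal sketch of how a `split` and `lapply` combination could subset and count the points. This is not the answerer's code: it assumes `in.out` from `mgcv` as the point-in-polygon test, and the `polys` and `pts` data frames are hypothetical toy data standing in for the tagged coordinates and the random points.

```r
library(mgcv)  # provides in.out()

# Hypothetical toy data: two squares, one per tag level
polys <- data.frame(
  x   = c(0, 2, 2, 0, 5, 7, 7, 5),
  y   = c(0, 0, 2, 2, 0, 0, 2, 2),
  tag = rep(c("a", "b"), each = 4)
)
pts <- data.frame(x = c(1, 6, 3.5, 1.5), y = c(1, 1, 1, 0.5))

# One boundary matrix per tag level
boundaries <- lapply(split(polys[, c("x", "y")], polys$tag), as.matrix)

# For each polygon, keep only the points that fall inside it
points_inside <- lapply(boundaries, function(b) {
  pts[in.out(b, as.matrix(pts)), , drop = FALSE]
})

# Count per polygon, returned as a named integer vector rather than a table
counts <- vapply(points_inside, nrow, integer(1))
counts
```

Because `split()` names the list by tag level, the same code handles any number of polygons without changes.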