Count points inside polygons by factor in R

Time: 2016-11-06 23:25:07

Tags: r count spatial

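I have two datasets of xy coordinates. The first has the xy coordinates plus a tag column that holds my factor levels. I call this data.frame qq, and it looks like this: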

structure(list(x = c(5109, 5128, 5137, 5185, 5258, 5324, 5387, 
5343, 5331, 5347, 5300, 5180, 4109, 4082, 4091, 4139, 4212, 4279, 
4291, 4297, 4285, 4301, 4254, 4181), y = c(1692, 1881, 2070, 
2119, 2144, 2065, 1987, 1813, 1705, 1649, 1631, 1654, 1847, 2015, 
2204, 2253, 2278, 2282, 2166, 1947, 1839, 1783, 1765, 1783), 
    tag = c("MPN_right", "MPN_right", "MPN_right", "MPN_right", 
    "MPN_right", "MPN_right", "MPN_right", "MPN_right", "MPN_right", 
    "MPN_right", "MPN_right", "MPN_right", "MPN_left", "MPN_left", 
    "MPN_left", "MPN_left", "MPN_left", "MPN_left", "MPN_left", 
    "MPN_left", "MPN_left", "MPN_left", "MPN_left", "MPN_left"
    )), .Names = c("x", "y", "tag"), row.names = c(NA, -24L), class = "data.frame") 

The other is random data, generated around the means of the qq x and y values with a large sd.

set.seed(123)
my_points=data.frame(x=rnorm(n =1000,mean=mean(qq$x),sd=1000),
y=rnorm(n=1000,mean=mean(qq$y),sd=1000))

If I use the in.out function from the mgcv package, I get something close to what I want.

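The main problem with this approach is that my 'polygon' is not closed, and it is not interpreted as 2 polygons according to the factor. The package suggests putting an NA row between the polygons, but I would rather use my tag column, because I will try more than 2 levels in my tag factor, i.e. more than 2 polygons. My end goal is a table with the number of points inside each polygon.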
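To illustrate the NA-row convention mentioned above, here is a minimal sketch. It uses hypothetical toy data (two unit squares and four test points), not the qq polygons from the post. It suggests why this form does not directly give per-tag counts: in.out() returns a single inside/outside flag per point for the whole boundary.

```r
library(mgcv)  # provides in.out()

# Toy data (hypothetical): two unit squares, not the qq polygons from the post
sq_a <- cbind(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))
sq_b <- cbind(x = c(2, 3, 3, 2), y = c(0, 0, 1, 1))

# Boundary in the NA-separated form: one NA row between the two loops
bnd <- rbind(sq_a, c(NA, NA), sq_b)

# Four test points: inside a, inside b, far outside, in the gap between squares
pts <- cbind(x = c(0.5, 2.5, 5, 1.5), y = c(0.5, 0.5, 5, 0.5))

inside <- in.out(bnd, pts)
inside  # one logical per point; it does not say which polygon was hit
```

Because the result carries no polygon identity, counting per tag needs one in.out() call per polygon, or a proper spatial overlay.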

3 answers:

Answer 0: (score: 2)

How about:

library(sp)  # SpatialPoints, Polygon, Polygons, SpatialPolygons, over

mysppoint <- SpatialPoints(coords = my_points)  # create spatial points
qq$tag <- as.factor(qq$tag)
polys = list()

# create one polygon for each factor level
for (lev in levels(qq$tag)){
  first_x <- qq$x[qq$tag == lev][1]
  first_y <- qq$y[qq$tag == lev][1]
  qq <- rbind(qq, data.frame(x = first_x, y = first_y, tag = lev))  # "close" the polygon by replicating the first row
  polys[[lev]] <- Polygons(list(Polygon(matrix(data = cbind(qq$x[qq$tag == lev], # transform to polygon
                                                            qq$y[qq$tag == lev]), 
                                               ncol = 2))), lev)
}

mypolys <-  SpatialPolygons(polys)   # convert to spatial polygons
inters  <-  factor(over(mysppoint, mypolys), labels = names(mypolys)) # intersect points with polygons
table(inters)

which gives:

inters
 MPN_left MPN_right 
       10        17 

The advantage of this is that it gives you proper spatial objects. For example:

library(ggplot2)  # fortify, ggplot, geom_point, geom_polygon

plotd <- fortify(mypolys)
p <- ggplot()
p <- p + geom_point(data = my_points, aes(x = x , y = y), size = 0.2)
p <- p + geom_polygon(data = plotd, aes(x = long, y = lat, fill = id), alpha = 0.7)
p

Plot of polygons and points
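The loop in this answer closes each polygon by appending rows to qq while iterating over its levels. A minimal base-R sketch of the same closing step that leaves the input untouched is shown below. The toy data and the close_ring() helper are hypothetical and only illustrate the idea.

```r
# Toy data (hypothetical): two open square rings identified by a tag column
rings <- data.frame(
  x   = c(0, 1, 1, 0, 2, 3, 3, 2),
  y   = c(0, 0, 1, 1, 0, 0, 1, 1),
  tag = rep(c("a", "b"), each = 4)
)

# close_ring() is a hypothetical helper: it repeats the first vertex at the
# end of a ring unless the ring is already closed
close_ring <- function(ring) {
  n <- nrow(ring)
  already_closed <- ring$x[1] == ring$x[n] && ring$y[1] == ring$y[n]
  if (already_closed) ring else rbind(ring, ring[1, ])
}

# One closed ring per tag level; the original data frame is not modified
closed <- lapply(split(rings, rings$tag), close_ring)
sapply(closed, nrow)
```

Each element of closed is a data frame whose x and y columns could then be turned into a matrix for the polygon-building step.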

Answer 1: (score: 1)

lapply() and sapply() help you apply a function to each level.

  ## a bit edited to make output clear

library(dplyr); library(mgcv)

TAG <- unique(qq$tag)

IN.OUT <- lapply(TAG, function(x) as.matrix(qq[qq$tag==x, 1:2])) %>%  # make a matrix per level
  sapply(function(x) in.out(x, as.matrix(my_points)))        # use in.out() with each matrix

colnames(IN.OUT) <- TAG

head(IN.OUT, n = 3)

#      MPN_right MPN_left
# [1,]     FALSE    FALSE
# [2,]     FALSE    FALSE
# [3,]     FALSE    FALSE

apply(IN.OUT, 2, table)

#       MPN_right MPN_left
# FALSE       983      990
# TRUE         17       10
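If only the counts are needed, the same per-level idea can return a named vector directly. The following is a sketch under assumptions: the toy polygons and points are hypothetical, and split() with vapply() stands in for the lapply()/sapply() pipeline above.

```r
library(mgcv)  # provides in.out()

# Toy data (hypothetical): two unit squares tagged "a" and "b"
polys <- data.frame(
  x   = c(0, 1, 1, 0, 2, 3, 3, 2),
  y   = c(0, 0, 1, 1, 0, 0, 1, 1),
  tag = rep(c("a", "b"), each = 4)
)

# Test points: two inside "a", one inside "b", one outside both
pts <- cbind(x = c(0.5, 0.2, 2.5, 5), y = c(0.5, 0.8, 0.5, 5))

# Split the vertices by tag, test all points against each ring, count the hits
counts <- vapply(
  split(polys[, c("x", "y")], polys$tag),
  function(ring) sum(in.out(as.matrix(ring), pts)),
  integer(1)
)
counts
```

This scales to any number of tag levels, since split() produces one ring per level without NA separator rows.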

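Answer 2: (score: 1)

I ended up using a combination of lapply and some more split. So here is the code; please ignore the extract_coords helper function, which basically gives the your_coords data frame with x, y and tag columns. I also managed to subset the points in the original points data and count them (returning them as a vector rather than a table).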

your_coords