我找到了两种主要的方法来查看一个点是否属于多边形。一种是使用光线追踪方法here,这是最推荐的答案,另一种是使用matplotlib path.contains_points
(这对我来说似乎有些模糊)。我将不得不连续检查很多点。有人知道这两个中的任何一个是否比另一个更值得推荐,或者是否有更好的第三选择?
更新:
我检查了两种方法,matplotlib看起来更快。
from time import time
import numpy as np
import matplotlib.path as mpltPath
# regular polygon for testing
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)[:-1]]
# random points set of points to test
N = 10000
points = zip(np.random.random(N),np.random.random(N))
# Ray tracing
def ray_tracing_method(x,y,poly):
n = len(poly)
inside = False
p1x,p1y = poly[0]
for i in range(n+1):
p2x,p2y = poly[i % n]
if y > min(p1y,p2y):
if y <= max(p1y,p2y):
if x <= max(p1x,p2x):
if p1y != p2y:
xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
if p1x == p2x or x <= xints:
inside = not inside
p1x,p1y = p2x,p2y
return inside
start_time = time()
inside1 = [ray_tracing_method(point[0], point[1], polygon) for point in points]
print "Ray Tracing Elapsed time: " + str(time()-start_time)
# Matplotlib mplPath
start_time = time()
path = mpltPath.Path(polygon)
inside2 = path.contains_points(points)
print "Matplotlib contains_points Elapsed time: " + str(time()-start_time)
给出,
Ray Tracing Elapsed time: 0.441395998001
Matplotlib contains_points Elapsed time: 0.00994491577148
使用三角形而不是100边多边形获得相同的相对差异。我也会检查一下,因为它看起来只是专门针对这类问题的包
答案 0 :(得分:47)
您可以考虑shapely:
from shapely.geometry import Point
from shapely.geometry.polygon import Polygon
point = Point(0.5, 0.5)
polygon = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
print(polygon.contains(point))
从您提到的方法我只使用了第二个path.contains_points
,它运行正常。在任何情况下,根据测试所需的精度,我建议创建一个numpy bool网格,其中多边形内的所有节点都为True(否则为False)。如果你打算对很多点进行测试,这可能会更快(虽然注意到这依赖于你在“像素”容差范围内进行测试):
from matplotlib import path
import matplotlib.pyplot as plt
import numpy as np
first = -3
size = (3-first)/100
xv,yv = np.meshgrid(np.linspace(-3,3,100),np.linspace(-3,3,100))
p = path.Path([(0,0), (0, 1), (1, 1), (1, 0)]) # square with legs length 1 and bottom left corner at the origin
flags = p.contains_points(np.hstack((xv.flatten()[:,np.newaxis],yv.flatten()[:,np.newaxis])))
grid = np.zeros((101,101),dtype='bool')
grid[((xv.flatten()-first)/size).astype('int'),((yv.flatten()-first)/size).astype('int')] = flags
xi,yi = np.random.randint(-300,300,100)/100,np.random.randint(-300,300,100)/100
vflag = grid[((xi-first)/size).astype('int'),((yi-first)/size).astype('int')]
plt.imshow(grid.T,origin='lower',interpolation='nearest',cmap='binary')
plt.scatter(((xi-first)/size).astype('int'),((yi-first)/size).astype('int'),c=vflag,cmap='Greens',s=90)
plt.show()
,结果如下:
答案 1 :(得分:7)
您的测试很好,但它只测量某些特定情况: 我们有一个具有许多顶点的多边形,以及用于在多边形内检查它们的长数组点。
而且,我想你不是在测量 matplotlib-inside-polygon-method vs ray-method, 但 matplotlib-somehow-optimized-iteration vs simple-list-iteration
让我们进行N次独立比较(N对点和多边形)?
# ... your code...
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)[:-1]]
M = 10000
start_time = time()
# Ray tracing
for i in range(M):
x,y = np.random.random(), np.random.random()
inside1 = ray_tracing_method(x,y, polygon)
print "Ray Tracing Elapsed time: " + str(time()-start_time)
# Matplotlib mplPath
start_time = time()
for i in range(M):
x,y = np.random.random(), np.random.random()
inside2 = path.contains_points([[x,y]])
print "Matplotlib contains_points Elapsed time: " + str(time()-start_time)
结果:
Ray Tracing Elapsed time: 0.548588991165
Matplotlib contains_points Elapsed time: 0.103765010834
Matplotlib仍然好得多,但不是100倍。 现在让我们尝试更简单的多边形......
lenpoly = 5
# ... same code
结果:
Ray Tracing Elapsed time: 0.0727779865265
Matplotlib contains_points Elapsed time: 0.105288982391
答案 2 :(得分:6)
如果速度是您所需要的并且额外的依赖性不是问题,您可能会发现numba
非常有用(现在在任何平台上都很容易安装)。您提出的经典ray_tracing
方法可以通过使用numba
装饰器并将多边形转换为numpy数组轻松移植到numba @jit
。代码应如下所示:
@jit(nopython=True)
def ray_tracing(x,y,poly):
n = len(poly)
inside = False
p2x = 0.0
p2y = 0.0
xints = 0.0
p1x,p1y = poly[0]
for i in range(n+1):
p2x,p2y = poly[i % n]
if y > min(p1y,p2y):
if y <= max(p1y,p2y):
if x <= max(p1x,p2x):
if p1y != p2y:
xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
if p1x == p2x or x <= xints:
inside = not inside
p1x,p1y = p2x,p2y
return inside
第一次执行将比任何后续调用花费更长的时间:
%%time
polygon=np.array(polygon)
inside1 = [numba_ray_tracing_method(point[0], point[1], polygon) for
point in points]
CPU times: user 129 ms, sys: 4.08 ms, total: 133 ms
Wall time: 132 ms
编译后会减少到:
CPU times: user 18.7 ms, sys: 320 µs, total: 19.1 ms
Wall time: 18.4 ms
如果在第一次调用函数时需要速度,则可以使用pycc
在模块中预编译代码。将函数存储在src.py中,如:
from numba import jit
from numba.pycc import CC
cc = CC('nbspatial')
@cc.export('ray_tracing', 'b1(f8, f8, f8[:,:])')
@jit(nopython=True)
def ray_tracing(x,y,poly):
n = len(poly)
inside = False
p2x = 0.0
p2y = 0.0
xints = 0.0
p1x,p1y = poly[0]
for i in range(n+1):
p2x,p2y = poly[i % n]
if y > min(p1y,p2y):
if y <= max(p1y,p2y):
if x <= max(p1x,p2x):
if p1y != p2y:
xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
if p1x == p2x or x <= xints:
inside = not inside
p1x,p1y = p2x,p2y
return inside
if __name__ == "__main__":
cc.compile()
使用python src.py
构建并运行:
import nbspatial
import numpy as np
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in
np.linspace(0,2*np.pi,lenpoly)[:-1]]
# random points set of points to test
N = 10000
# making a list instead of a generator to help debug
points = zip(np.random.random(N),np.random.random(N))
polygon = np.array(polygon)
%%time
result = [nbspatial.ray_tracing(point[0], point[1], polygon) for point in points]
CPU times: user 20.7 ms, sys: 64 µs, total: 20.8 ms
Wall time: 19.9 ms
在我使用的numba代码中: &#39; b1(f8,f8,f8 [:,]]和#39;
为了使用nopython=True
进行编译,需要在for loop
之前声明每个var。
在prebuild src代码中的行:
@cc.export('ray_tracing', 'b1(f8, f8, f8[:,:])')
用于声明函数名称及其I / O var类型,布尔输出b1
和两个浮点f8
以及浮点f8[:,:]
的二维数组作为输入。
答案 3 :(得分:1)
我找到了其他方法来检查点是否在多边形内 (here)。我只测试了其中的两个(is_inside_sm 和 is_inside_postgis),结果与其他方法相同。
感谢@epifanio,我将代码并行化并与@epifanio 和@user3274748 (ray_tracing_numpy) 方法进行了比较。请注意,这两种方法都有一个错误,所以我按照下面的代码所示修复了它们。
我发现的另一件事是,为创建多边形提供的代码不会生成闭合路径 np.linspace(0,2*np.pi,lenpoly)[:-1]
。因此,上述 GitHub 存储库中提供的代码可能无法正常工作。所以最好创建一个封闭路径(第一个和最后一个点应该相同)。
代码
方法一:parallelpointinpolygon
from numba import jit, njit
import numba
import numpy as np
@jit(nopython=True)
def pointinpolygon(x,y,poly):
n = len(poly)
inside = False
p2x = 0.0
p2y = 0.0
xints = 0.0
p1x,p1y = poly[0]
for i in numba.prange(n+1):
p2x,p2y = poly[i % n]
if y > min(p1y,p2y):
if y <= max(p1y,p2y):
if x <= max(p1x,p2x):
if p1y != p2y:
xints = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
if p1x == p2x or x <= xints:
inside = not inside
p1x,p1y = p2x,p2y
return inside
@njit(parallel=True)
def parallelpointinpolygon(points, polygon):
D = np.empty(len(points), dtype=numba.boolean)
for i in numba.prange(0, len(D)): #<-- Fixed here, must start from zero
D[i] = pointinpolygon(points[i,0], points[i,1], polygon)
return D
方法二:ray_tracing_numpy_numba
@jit(nopython=True)
def ray_tracing_numpy_numba(points,poly):
x,y = points[:,0], points[:,1]
n = len(poly)
inside = np.zeros(len(x),np.bool_)
p2x = 0.0
p2y = 0.0
p1x,p1y = poly[0]
for i in range(n+1):
p2x,p2y = poly[i % n]
idx = np.nonzero((y > min(p1y,p2y)) & (y <= max(p1y,p2y)) & (x <= max(p1x,p2x)))[0]
if len(idx): # <-- Fixed here. If idx is null skip comparisons below.
if p1y != p2y:
xints = (y[idx]-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
if p1x == p2x:
inside[idx] = ~inside[idx]
else:
idxx = idx[x[idx] <= xints]
inside[idxx] = ~inside[idxx]
p1x,p1y = p2x,p2y
return inside
方法 3: Matplotlib contains_points
path = mpltPath.Path(polygon,closed=True) # <-- Very important to mention that the path
# is closed (default is false)
方法 4: is_inside_sm(从 here 获得)
@jit(nopython=True)
def is_inside_sm(polygon, point):
length = len(polygon)-1
dy2 = point[1] - polygon[0][1]
intersections = 0
ii = 0
jj = 1
while ii<length:
dy = dy2
dy2 = point[1] - polygon[jj][1]
# consider only lines which are not completely above/bellow/right from the point
if dy*dy2 <= 0.0 and (point[0] >= polygon[ii][0] or point[0] >= polygon[jj][0]):
# non-horizontal line
if dy<0 or dy2<0:
F = dy*(polygon[jj][0] - polygon[ii][0])/(dy-dy2) + polygon[ii][0]
if point[0] > F: # if line is left from the point - the ray moving towards left, will intersect it
intersections += 1
elif point[0] == F: # point on line
return 2
# point on upper peak (dy2=dx2=0) or horizontal line (dy=dy2=0 and dx*dx2<=0)
elif dy2==0 and (point[0]==polygon[jj][0] or (dy==0 and (point[0]-polygon[ii][0])*(point[0]-polygon[jj][0])<=0)):
return 2
ii = jj
jj += 1
#print 'intersections =', intersections
return intersections & 1
@njit(parallel=True)
def is_inside_sm_parallel(points, polygon):
ln = len(points)
D = np.empty(ln, dtype=numba.boolean)
for i in numba.prange(ln):
D[i] = is_inside_sm(polygon,points[i])
return D
方法 5: is_inside_postgis(从 here 获得)
@jit(nopython=True)
def is_inside_postgis(polygon, point):
length = len(polygon)
intersections = 0
dx2 = point[0] - polygon[0][0]
dy2 = point[1] - polygon[0][1]
ii = 0
jj = 1
while jj<length:
dx = dx2
dy = dy2
dx2 = point[0] - polygon[jj][0]
dy2 = point[1] - polygon[jj][1]
F =(dx-dx2)*dy - dx*(dy-dy2);
if 0.0==F and dx*dx2<=0 and dy*dy2<=0:
return 2;
if (dy>=0 and dy2<0) or (dy2>=0 and dy<0):
if F > 0:
intersections += 1
elif F < 0:
intersections -= 1
ii = jj
jj += 1
#print 'intersections =', intersections
return intersections != 0
@njit(parallel=True)
def is_inside_postgis_parallel(points, polygon):
ln = len(points)
D = np.empty(ln, dtype=numba.boolean)
for i in numba.prange(ln):
D[i] = is_inside_postgis(polygon,points[i])
return D
1000 万积分计时:
parallelpointinpolygon Elapsed time: 4.0122294425964355
Matplotlib contains_points Elapsed time: 14.117807388305664
ray_tracing_numpy_numba Elapsed time: 7.908452272415161
sm_parallel Elapsed time: 0.7710440158843994
is_inside_postgis_parallel Elapsed time: 2.131121873855591
这是代码。
import matplotlib.pyplot as plt
import matplotlib.path as mpltPath
from time import time
import numpy as np
np.random.seed(2)
time_parallelpointinpolygon=[]
time_mpltPath=[]
time_ray_tracing_numpy_numba=[]
time_is_inside_sm_parallel=[]
time_is_inside_postgis_parallel=[]
n_points=[]
for i in range(1, 10000002, 1000000):
n_points.append(i)
lenpoly = 100
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)]
polygon = np.array(polygon)
N = i
points = np.random.uniform(-1.5, 1.5, size=(N, 2))
#Method 1
start_time = time()
inside1=parallelpointinpolygon(points, polygon)
time_parallelpointinpolygon.append(time()-start_time)
# Method 2
start_time = time()
path = mpltPath.Path(polygon,closed=True)
inside2 = path.contains_points(points)
time_mpltPath.append(time()-start_time)
# Method 3
start_time = time()
inside3=ray_tracing_numpy_numba(points,polygon)
time_ray_tracing_numpy_numba.append(time()-start_time)
# Method 4
start_time = time()
inside4=is_inside_sm_parallel(points,polygon)
time_is_inside_sm_parallel.append(time()-start_time)
# Method 5
start_time = time()
inside5=is_inside_postgis_parallel(points,polygon)
time_is_inside_postgis_parallel.append(time()-start_time)
plt.plot(n_points,time_parallelpointinpolygon,label='parallelpointinpolygon')
plt.plot(n_points,time_mpltPath,label='mpltPath')
plt.plot(n_points,time_ray_tracing_numpy_numba,label='ray_tracing_numpy_numba')
plt.plot(n_points,time_is_inside_sm_parallel,label='is_inside_sm_parallel')
plt.plot(n_points,time_is_inside_postgis_parallel,label='is_inside_postgis_parallel')
plt.xlabel("N points")
plt.ylabel("time (sec)")
plt.legend(loc = 'best')
plt.show()
结论
最快的算法是:
1- is_inside_sm_parallel
2- is_inside_postgis_parallel
3- parallelpointinpolygon (@epifanio)
答案 4 :(得分:0)
我将其保留在此处,仅使用numpy重写了上面的代码,也许有人觉得它有用:
def ray_tracing_numpy(x,y,poly):
n = len(poly)
inside = np.zeros(len(x),np.bool_)
p2x = 0.0
p2y = 0.0
xints = 0.0
p1x,p1y = poly[0]
for i in range(n+1):
p2x,p2y = poly[i % n]
idx = np.nonzero((y > min(p1y,p2y)) & (y <= max(p1y,p2y)) & (x <= max(p1x,p2x)))[0]
if p1y != p2y:
xints = (y[idx]-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
if p1x == p2x:
inside[idx] = ~inside[idx]
else:
idxx = idx[x[idx] <= xints]
inside[idxx] = ~inside[idxx]
p1x,p1y = p2x,p2y
return inside
将ray_trace封装为
def ray_tracing_mult(x,y,poly):
return [ray_tracing(xi, yi, poly[:-1,:]) for xi,yi in zip(x,y)]
经过100000点测试,结果:
ray_tracing_mult 0:00:00.850656
ray_tracing_numpy 0:00:00.003769
答案 5 :(得分:0)
Even-odd rule 的纯 numpy 向量化实现
其他答案要么是一个缓慢的 python 循环,要么需要外部依赖或 cython 处理。
import numpy as np
def points_in_polygon(polygon, pts):
pts = np.asarray(pts,dtype='float32')
polygon = np.asarray(polygon,dtype='float32')
contour2 = np.vstack((polygon[1:], polygon[:1]))
test_diff = contour2-polygon
mask1 = (pts[:,None] == polygon).all(-1).any(-1)
m1 = (polygon[:,1] > pts[:,None,1]) != (contour2[:,1] > pts[:,None,1])
slope = ((pts[:,None,0]-polygon[:,0])*test_diff[:,1])-(test_diff[:,0]*(pts[:,None,1]-polygon[:,1]))
m2 = slope == 0
mask2 = (m1 & m2).any(-1)
m3 = (slope < 0) != (contour2[:,1] < polygon[:,1])
m4 = m1 & m3
count = np.count_nonzero(m4,axis=-1)
mask3 = ~(count%2==0)
mask = mask1 | mask2 | mask3
return mask
N = 1000000
lenpoly = 1000
polygon = [[np.sin(x)+0.5,np.cos(x)+0.5] for x in np.linspace(0,2*np.pi,lenpoly)]
polygon = np.array(polygon,dtype='float32')
points = np.random.uniform(-1.5, 1.5, size=(N, 2)).astype('float32')
mask = points_in_polygon(polygon, points)
大小为 1000 的多边形的 100 万个点需要 44 秒。
它比其他实现慢几个数量级,但仍然比 python 循环快,并且只使用 numpy。