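I have a 2D map divided into a rectangular grid (about 45,000 cells) and a number of polygons/multipolygons derived from shapefiles (I currently read them with the shapefile library pyshp). Unfortunately, some of these polygons are quite complex and consist of a large number of points (one of them has 640,000 points), and they may contain holes.
What I am trying to do is check, for each of these polygons, which cell centers (of the grid cells) fall inside that particular polygon. However, with about 45,000 cell centers and 150 polygons, checking everything with shapely takes quite a long time. This is more or less what I am doing: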
from shapely.geometry import Point

# nx and ny are the number of cells in the x and y direction respectively
# x, y are the coordinates of the cell centers in the grid - numpy arrays
# shapes is a list of shapely shapes - either Polygon or MultiPolygon
for j in range(ny):
    for i in range(nx):
        p = Point([x[i], y[j]])
        for s in shapes:
            if s.contains(p):
                # The shape s contains the point p - remove the cell
                pass  # placeholder for the removal logic
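Depending on the polygon involved, each check takes somewhere between 80 and 200 milliseconds, and with more than 6 million checks to do, the runtime easily reaches several hours.
I suppose there must be a smarter way to tackle this. I would prefer to stay with shapely if possible, but any suggestions are appreciated.
Thanks.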
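Answer 0 (score: 4)
The accepted answer uses rtrees in shapely, which only work on the extents of the geometries, that is, their bounding boxes. As stated in the shapely documentation: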
https://shapely.readthedocs.io/en/stable/manual.html#str-packed-r-tree
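As of 1.7.1, the results of the tree query still have to be checked to see whether the polygon actually contains each point. This means an extra check, but it can still speed up the algorithm as long as the polygons are not in a very messy configuration.
That said, the code should look like this: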
from shapely.strtree import STRtree
from shapely.geometry import Polygon, Point
# this is the grid. It includes a point for each rectangle center.
points = [Point(i, j) for i in range(10) for j in range(10)]
tree = STRtree(points) # create a 'database' of points
# this is one of the polygons.
p = Polygon([[0.5, 0], [2, 0.2], [3, 4], [0.8, 0.5], [0.5, 0]])
res = [o for o in tree.query(p) if p.contains(o)] # run the query (a single shot) - and test if the found points are actually inside the polygon.
# do something with the results
print([o.wkt for o in res])
The result is now indeed:
['POINT (2 1)', 'POINT (2 2)']
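To make the two-stage idea concrete, here is a minimal sketch in plain Python (no shapely) of what happens conceptually: a cheap bounding-box filter followed by an exact test on the survivors. The names `point_in_polygon`, `bbox_hits` and `exact_hits` are made up for this illustration, and the even-odd ray casting is only a stand-in for shapely's `contains`, not its actual implementation.

```python
# Illustration only: plain Python, no shapely. All helper names are hypothetical.

def point_in_polygon(x, y, ring):
    """Even-odd ray casting; points exactly on the boundary may go either way."""
    inside = False
    n = len(ring)
    for k in range(n):
        x1, y1 = ring[k]
        x2, y2 = ring[(k + 1) % n]
        # Does this edge straddle the horizontal line through (x, y)?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Same polygon and grid points as in the example above.
ring = [(0.5, 0), (2, 0.2), (3, 4), (0.8, 0.5)]
points = [(i, j) for i in range(10) for j in range(10)]

# Stage 1: bounding-box filter (roughly what the tree query gives you).
xs = [px for px, _ in ring]
ys = [py for _, py in ring]
min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
bbox_hits = [(px, py) for px, py in points
             if min_x <= px <= max_x and min_y <= py <= max_y]

# Stage 2: exact test, run only on the bounding-box survivors.
exact_hits = [pt for pt in bbox_hits if point_in_polygon(pt[0], pt[1], ring)]

print(len(bbox_hits), exact_hits)
```

For this polygon, 15 of the 100 points pass the bounding-box stage and only 2 of them survive the exact test, which is why the extra `contains` check is needed after the query.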
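--- Bonus edit ---
I observed an even better speed-up by localizing the problem, in the following sense:
Build a square grid over the area, and build one STRtree over the points and another over the polygons. Query both trees with each individual grid cell, and within each cell check whether the points returned by the query are contained in the polygons returned by the query. This reduces the number of checks made against any single polygon.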
from shapely.strtree import STRtree
from shapely.geometry import Polygon, Point
import random
# build a grid spread over the area.
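# vertices are listed in ring order so that each cell is a valid (non self-intersecting) square
grid = [Polygon([[i, j], [i+1, j], [i+1, j+1], [i, j+1]]) for i in range(10) for j in range(10)]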
pointtree = STRtree([Point(random.uniform(1,10),random.uniform(1,10)) for q in range(50)]) # create a 'database' of 50 random points - and build a STRtree structure over it
# take the same polygon as above, but put in an STRtree
polytree = STRtree([Polygon([[0.5, 0], [2, 0.2], [3, 4], [0.8, 0.5], [0.5, 0]])])
res = []
# do nested queries inside the grid
for g in grid:
    polyquery = polytree.query(g)
    pointquery = pointtree.query(g)
    for point in pointquery:
        for poly in polyquery:
            if poly.contains(point):
                res.append(point)
# do something with the results
print([o.wkt for o in res])
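Tying this back to the grid in the question, the following is a rough sketch of how the query-then-check pattern could collect the (i, j) indices of the cells to remove. The data here (`x`, `y`, `shapes`) are small hypothetical stand-ins for the question's arrays, and `candidates` and `cell_of` are helper names invented for this example. The helper assumes that `query()` returns geometries in Shapely 1.7/1.8 and integer indices in Shapely 2.x, and handles both.

```python
from shapely.geometry import Point, Polygon
from shapely.strtree import STRtree

# Hypothetical stand-ins for the question's data.
x = [float(i) for i in range(10)]  # cell-center x coordinates
y = [float(j) for j in range(10)]  # cell-center y coordinates
shapes = [Polygon([[0.5, 0], [2, 0.2], [3, 4], [0.8, 0.5], [0.5, 0]])]

centers = [Point(x[i], y[j]) for j in range(len(y)) for i in range(len(x))]
# Map coordinates back to grid indices so a hit can be turned into (i, j).
cell_of = {(x[i], y[j]): (i, j) for j in range(len(y)) for i in range(len(x))}
tree = STRtree(centers)  # built once, reused for every shape

def candidates(tree, geoms, target):
    """Return candidate geometries, whether query() yields geometries
    (assumed for Shapely 1.7/1.8) or integer indices (assumed for 2.x)."""
    return [h if hasattr(h, "geom_type") else geoms[h] for h in tree.query(target)]

cells_to_remove = set()
for s in shapes:
    # Bounding-box candidates first, exact containment test second.
    for pt in candidates(tree, centers, s):
        if s.contains(pt):
            cells_to_remove.add(cell_of[(pt.x, pt.y)])

print(sorted(cells_to_remove))
```

The point of this layout is that the tree over the roughly 45,000 cell centers is built only once, so each of the 150 shapes is tested only against the centers that fall inside its bounding box.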
Answer 1 (score: 1)
If you want to perform fewer comparison operations, you can try shapely's STRtree feature. Consider the following code:
from shapely.strtree import STRtree
from shapely.geometry import Polygon, Point
# this is the grid. It includes a point for each rectangle center.
points = [Point(i, j) for i in range(10) for j in range(10)]
tree = STRtree(points)  # create a 'database' of points
# this is one of the polygons.
p = Polygon([[0.5, 0], [2, 0.2], [3, 4], [0.8, 0.5], [0.5, 0]])
res = tree.query(p)  # run the query (a single shot).
# do something with the results
print([o.wkt for o in res])
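One caveat with the snippet above: the query selects by bounding box only, and its return type depends on the shapely version. As far as I know, Shapely 1.7/1.8 return geometries, while Shapely 2.x returns integer indices and accepts a `predicate` argument that performs the exact test inside the query. The sketch below is written under that assumption and branches on `shapely.__version__`; treat it as an illustration, not a definitive recipe.

```python
import shapely
from shapely.geometry import Point, Polygon
from shapely.strtree import STRtree

points = [Point(i, j) for i in range(10) for j in range(10)]
tree = STRtree(points)
p = Polygon([[0.5, 0], [2, 0.2], [3, 4], [0.8, 0.5], [0.5, 0]])

major = int(shapely.__version__.split(".")[0])
if major >= 2:
    # Assumed Shapely 2.x behaviour: query() returns indices into `points`,
    # and predicate="contains" keeps only tree geometries that p contains.
    inside = [points[i] for i in tree.query(p, predicate="contains")]
else:
    # Assumed Shapely 1.7/1.8 behaviour: query() returns geometries chosen
    # by bounding box only, so the exact test is done afterwards.
    inside = [g for g in tree.query(p) if p.contains(g)]

print(sorted((g.x, g.y) for g in inside))
```

Either branch should leave only the two points that actually lie inside the polygon, matching the result reported in the other answer.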