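I am working on a search system that should detect whether a start point and an end point lie on a route (or within about 50 km of it). I have many routes stored in a MySQL database as points [300k rows].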
Structure
id [primary] | id_route | id_point | lat_lng_point (spatial index)
1 | 1 | 1 | [GEOMETRY - 25 B]
2 | 1 | 2 | [GEOMETRY - 25 B]
3 | 1 | 3 | [GEOMETRY - 25 B]
4 | 1 | 4 | [GEOMETRY - 25 B]
5 | 2 | 1 | [GEOMETRY - 25 B]
6 | 2 | 2 | [GEOMETRY - 25 B]
... | ... | ... | ...
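The question is: how can I most efficiently select only the routes (id_route) that have both the start point and the end point on them (or within 50 km)?
I have tried using a UNION [in the example below] (or an inner join), but the query takes 0.4 s, which is too slow. Any idea how to optimize it?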
SELECT * FROM
(
    (
        SELECT DISTINCT(id_route)
        FROM route_path2
        WHERE ST_Contains( ST_MakeEnvelope(
            Point(($lng_start+(50/111)), ($lat_start+(50/111))),
            Point(($lng_start-(50/111)), ($lat_start-(50/111)))
        ), route_path2.lat_lng_point )
    )
    UNION ALL
    (
        SELECT DISTINCT(id_route)
        FROM route_path2
        WHERE ST_Contains( ST_MakeEnvelope(
            Point(($lng_end+(50/111)), ($lat_end+(50/111))),
            Point(($lng_end-(50/111)), ($lat_end-(50/111)))
        ), route_path2.lat_lng_point )
    )
) AS t GROUP BY id_route HAVING count(*) >= 2
Edit
I optimized the query based on @Djeramon's suggestion, and it now takes 0.06 s. I don't know whether this is the best I can do. What if I have 50M rows?
-- Step 1: routes with at least one point inside the box around the start
CREATE TEMPORARY TABLE starts_on_route AS
    SELECT DISTINCT id_route
    FROM route_path2
    WHERE ST_Contains( ST_MakeEnvelope(
        Point((17.1077+(50/111)), (48.1486+(50/111))),
        Point((17.1077-(50/111)), (48.1486-(50/111)))
    ), route_path2.lat_lng_point );

CREATE INDEX starts_on_route_inx ON starts_on_route(id_route);

-- Step 2: of those routes, keep the ones with a point inside the box around the end.
-- INNER JOIN replaces the LEFT JOIN plus the repeated equality in WHERE; the result is the same.
SELECT DISTINCT route_path2.id_route
FROM route_path2
INNER JOIN starts_on_route
    ON route_path2.id_route = starts_on_route.id_route
WHERE ST_Contains( ST_MakeEnvelope(
    Point((18.7408+(50/111)), (49.2194+(50/111))),
    Point((18.7408-(50/111)), (49.2194-(50/111)))
), route_path2.lat_lng_point );
Answer 0 (score: 0)
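Currently you are running the query over the whole route table twice. Try running the first subquery to determine all routes with a valid start point, and then run the second subquery only on those relevant routes. This should save roughly 50% of the processing time.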
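As a minimal sketch of that idea in a single statement, the start filter can be written as a derived table that the end filter joins against. It assumes the `route_path2` table and the sample coordinates from the question; the aliases `s` and `e` are only illustrative. Whether MySQL really evaluates the start filter first is up to the optimizer, so it is worth checking the plan with EXPLAIN.

```sql
-- Sketch only: table and column names are taken from the question,
-- the aliases s (start) and e (end) are illustrative.
SELECT DISTINCT e.id_route
FROM (
    -- Routes with at least one point inside the box around the start
    SELECT DISTINCT id_route
    FROM route_path2
    WHERE ST_Contains(
        ST_MakeEnvelope(
            Point(17.1077 + (50/111), 48.1486 + (50/111)),
            Point(17.1077 - (50/111), 48.1486 - (50/111))
        ),
        lat_lng_point
    )
) AS s
INNER JOIN route_path2 AS e
    ON e.id_route = s.id_route
-- Of those routes, keep the ones with a point inside the box around the end
WHERE ST_Contains(
    ST_MakeEnvelope(
        Point(18.7408 + (50/111), 49.2194 + (50/111)),
        Point(18.7408 - (50/111), 49.2194 - (50/111))
    ),
    e.lat_lng_point
);
```

An ordinary index on `route_path2.id_route` (not shown in the question, so an assumption here) would likely help the join side of this query.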
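One way to do this is to use a temporary table to store the results of the first query. However, you need to keep an eye on the overhead of creating it, and creating an index on it is probably a good idea. For more details, see http://blog.endpoint.com/2015/02/temporary-tables-in-sql-query.html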
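A possible way to keep the temporary-table overhead small is to declare the index in the same statement that fills the table, so no separate CREATE INDEX is needed. This is only a sketch under the question's names (`route_path2`, `starts_on_route`) and sample coordinates; whether it is faster than the two-statement version would have to be measured.

```sql
-- Sketch only: names come from the question; declaring the index inline
-- is an illustrative variation, not something the question or answer prescribes.
DROP TEMPORARY TABLE IF EXISTS starts_on_route;

-- Build the table and its index in one statement
CREATE TEMPORARY TABLE starts_on_route (INDEX (id_route)) AS
    SELECT DISTINCT id_route
    FROM route_path2
    WHERE ST_Contains(
        ST_MakeEnvelope(
            Point(17.1077 + (50/111), 48.1486 + (50/111)),
            Point(17.1077 - (50/111), 48.1486 - (50/111))
        ),
        lat_lng_point
    );

-- Run the end-point filter only against routes that passed the start filter
SELECT DISTINCT p.id_route
FROM route_path2 AS p
INNER JOIN starts_on_route AS s
    ON s.id_route = p.id_route
WHERE ST_Contains(
    ST_MakeEnvelope(
        Point(18.7408 + (50/111), 49.2194 + (50/111)),
        Point(18.7408 - (50/111), 49.2194 - (50/111))
    ),
    p.lat_lng_point
);

-- Temporary tables disappear when the session ends; dropping explicitly
-- lets a reused connection run the next search cleanly.
DROP TEMPORARY TABLE IF EXISTS starts_on_route;
```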