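I need to collect the rows that fall within a 100-square-meter area around certain lat/lon data points. I am currently running a query like the one below, which works for a small number of locations. The query below, for 15 data points, takes about 10 minutes to run, but this approach does not scale to more data points. I ran a similar query with 4000 lat/lon data points (4000 locations on the US map) and it took 30 hours to run. I know the WHERE clause scans the whole table row by row, which is why the query runs so long. Even when I select only the few columns I need, the query still takes a long time. Does anyone have a better way to achieve this? Please advise.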
create table crt1
as
select * from masterdata
where
(round(device_lat,4) >= 33.7306 and round(device_lat , 4) <= 33.7316 and round(device_lon,4) >= -117.8364 and round(device_lon , 4) <= -117.8354) or
(round(device_lat,4) >= 37.927 and round(device_lat , 4) <= 37.928 and round(device_lon,4) >= -122.517 and round(device_lon , 4) <= -122.516) or
(round(device_lat,4) >= 30.2711 and round(device_lat , 4) <= 30.2721 and round(device_lon,4) >= -97.7544 and round(device_lon , 4) <= -97.7534) or
(round(device_lat,4) >= 33.0673 and round(device_lat , 4) <= 33.0683 and round(device_lon,4) >= -117.2642 and round(device_lon , 4) <= -117.2632) or
(round(device_lat,4) >= 34.8271 and round(device_lat , 4) <= 34.8281 and round(device_lon,4) >= -82.3011 and round(device_lon , 4) <= -82.3001) or
(round(device_lat,4) >= 32.9258 and round(device_lat , 4) <= 32.9268 and round(device_lon,4) >= -96.8311 and round(device_lon , 4) <= -96.8301) or
(round(device_lat,4) >= 45.0917 and round(device_lat , 4) <= 45.0927 and round(device_lon,4) >= -93.4272 and round(device_lon , 4) <= -93.4262) or
(round(device_lat,4) >= 36.0214 and round(device_lat , 4) <= 36.0224 and round(device_lon,4) >= -115.0853 and round(device_lon , 4) <= -115.0843) or
(round(device_lat,4) >= 47.2156 and round(device_lat , 4) <= 47.2166 and round(device_lon,4) >= -122.2351 and round(device_lon , 4) <= -122.2341) or
(round(device_lat,4) >= 32.2492 and round(device_lat , 4) <= 32.2502 and round(device_lon,4) >= -110.8845 and round(device_lon , 4) <= -110.8835) or
(round(device_lat,4) >= 32.286 and round(device_lat , 4) <= 32.287 and round(device_lon,4) >= -110.9753 and round(device_lon , 4) <= -110.9743) or
(round(device_lat,4) >= 36.8477 and round(device_lat , 4) <= 36.8487 and round(device_lon,4) >= -119.7911 and round(device_lon , 4) <= -119.7901) or
(round(device_lat,4) >= 36.0842 and round(device_lat , 4) <= 36.0852 and round(device_lon,4) >= -79.8363 and round(device_lon , 4) <= -79.8353) or
(round(device_lat,4) >= 39.0612 and round(device_lat , 4) <= 39.0622 and round(device_lon,4) >= -77.1245 and round(device_lon , 4) <= -77.1235) or
(round(device_lat,4) >= 32.8389 and round(device_lat , 4) <= 32.8399 and round(device_lon,4) >= -117.1629 and round(device_lon , 4) <= -117.1619) or
(round(device_lat,4) >= 61.1948 and round(device_lat , 4) <= 61.1958 and round(device_lon,4) >= -149.9061 and round(device_lon , 4) <= -149.9051);
Answer 0 (score: 1)
First: create an index on masterdata(device_lat) and another on masterdata(device_lon).
Second, rewrite each line of this query as:
(device_lat >= 32.8389 and device_lat <= 32.8399 and
device_lon >= -117.1629 and device_lon <= -117.1619) or ...
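Your use of round(lat,4) has defeated your ability to use an index for the search, which makes it very slow indeed: because the function is applied to the column, MySQL has to evaluate it for every row and falls back to scanning the whole table.
If you are working with GPS data, or you happen to be using a spherical-earth approximation for distance, the round() function gains you no precision. The actual precision of a global position is about four decimal places; more digits of precision neither help nor hurt your accuracy.
If you understand the terms Universal Transverse Mercator Projection or Lambert Projection, then you actually know more about the precision of your data than your question indicates, and you should use that knowledge.
Actually, truth be told, you should rewrite it like this: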
SELECT m.*
  FROM masterdata AS m
 CROSS JOIN ( /* half-width of the box, in degrees of latitude */
        SELECT 0.0005 AS radius
       ) AS r
  JOIN ( /* make a virtual table of your bunch of centerpoints */
        SELECT 33.7311 AS lat, -117.8359 AS lon
         UNION ALL
        SELECT 37.9275, -122.5165
         UNION ALL
        SELECT somelat, somelon
         UNION ALL ...
       ) AS points
    ON m.device_lat >= points.lat - r.radius
   AND m.device_lat <= points.lat + r.radius
   AND m.device_lon >= points.lon - (r.radius / COS(RADIANS(points.lat)))
   AND m.device_lon <= points.lon + (r.radius / COS(RADIANS(points.lat)))
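This will get you your results as efficiently as possible. It adjusts the radius value for the longitude search, to correct for the way lines of longitude get closer together as you move away from the equator. And it lets MySQL do the optimizing.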
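To make the longitude correction concrete, here is a minimal Python sketch of the same bounds the query computes for one center point. The function name bounding_box and its argument names are made up for illustration; only the 1 / COS(RADIANS(lat)) adjustment comes from the query above.

```python
import math

def bounding_box(lat, lon, radius_deg):
    """Return (lat_min, lat_max, lon_min, lon_max) for a box centered on (lat, lon).

    radius_deg is the half-width of the box in degrees of latitude.
    """
    # A degree of longitude covers less ground as you move away from the
    # equator (it shrinks by cos(latitude)), so the longitude half-width
    # is widened by 1 / cos(latitude) to span the same ground distance.
    lon_radius = radius_deg / math.cos(math.radians(lat))
    return (lat - radius_deg, lat + radius_deg,
            lon - lon_radius, lon + lon_radius)

# First center point from the question, with the 0.0005 degree half-width.
print(bounding_box(33.7311, -117.8359, 0.0005))
```

At the equator the two half-widths are equal; at 60 degrees of latitude the longitude half-width is twice the latitude one.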
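Edit
I just noticed your 100-square-meter requirement, which I interpret as a bounding box of +/- 50 meters on the ground. (You are getting close to the limits of accuracy here.)
A degree of latitude is 111045 meters, so you need a radius value of (50.0 / 111045.0), which comes to about 0.0004503. The value shown in your question, 0.0005, is more like a 111-meter square.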
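The arithmetic is easy to check. This is just a sketch in Python; the constant and function names are my own, and 111045 meters per degree of latitude is the figure used above.

```python
METERS_PER_DEGREE_LAT = 111045.0  # figure quoted in this answer

def meters_to_degrees(meters):
    """Convert a north-south ground distance to degrees of latitude."""
    return meters / METERS_PER_DEGREE_LAT

def degrees_to_meters(degrees):
    """Convert degrees of latitude to a north-south ground distance."""
    return degrees * METERS_PER_DEGREE_LAT

# Half-width, in degrees, of a box reaching 50 meters either side of center.
radius = meters_to_degrees(50.0)
print(round(radius, 7))

# Full side length, in meters, of the box implied by a 0.0005 degree half-width.
side = 2 * degrees_to_meters(0.0005)
print(round(side, 3))
```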
Here is some background reading: http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/
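With 4000 center points you will not want to type the virtual table of centerpoints by hand. One possible way to generate that text is sketched below in Python. The function name points_subquery and the column aliases lat and lon are assumptions of mine (LONG is a reserved word in MySQL, so lon is the safer alias), and rounding the centers to four decimal places simply mirrors the values in the question.

```python
import math

def points_subquery(points):
    """Build 'SELECT lat, lon UNION ALL ...' text from (lat, lon) pairs."""
    if not points:
        raise ValueError("need at least one center point")
    rows = []
    for i, (lat, lon) in enumerate(points):
        # Convert to float and insist on finite values, so that only
        # plain numeric literals end up in the generated SQL.
        lat, lon = float(lat), float(lon)
        if not (math.isfinite(lat) and math.isfinite(lon)):
            raise ValueError("coordinates must be finite numbers")
        if i == 0:
            # Only the first SELECT needs to name the columns.
            rows.append(f"SELECT {lat:.4f} AS lat, {lon:.4f} AS lon")
        else:
            rows.append(f"SELECT {lat:.4f}, {lon:.4f}")
    return "\n UNION ALL\n".join(rows)

print(points_subquery([(33.7311, -117.8359), (37.9275, -122.5165)]))
```

Paste the output in place of the centerpoint subquery in the rewritten query.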