Why is this column not computed by MySQL? How can I rewrite my query?

Time: 2014-11-05 06:12:00

Tags: mysql sql database performance

So, I have to write a query where, given a bunch of points with (x, y, z) coordinates, I need to find the closest place to each point from a given list of place ids (4030764,4030734,4030752,3948...).

The point coordinates are supplied by the user and hence are not present in the database, so I use a list of SELECT... UNION as a workaround for a temporary table.

places_table actually holds x, y, z for every place id, and it looks like this -

CREATE TABLE `places_table` (
    `place_id` int(10) unsigned NOT NULL,
    `latlon_id` int(11) DEFAULT NULL,
    `latitude` double DEFAULT NULL,
    `longitude` double DEFAULT NULL,
    `x` double DEFAULT NULL,
    `y` double DEFAULT NULL,
    `z` double DEFAULT NULL
)

Performance is really critical, so I am trying to calculate all the squared distances (between every point-place pair) and pick the minimum-distance place for each point in one go.

I tried to write this query (probably not a very good attempt...)

SELECT point.index, (@px:=point.x) AS point_x, (@py:=point.y) AS point_y, (@pz:=point.z) AS point_z,
    place.lat_lon_id, place.place_id, place.sq_dist
FROM (
    SELECT DISTINCT(latlon_id) AS lat_lon_id, place_id,
        (@sq_dist:=(pow(x-@px, 2) + pow(y-@py, 2) + pow(z-@pz, 2))) AS sq_dist
    FROM places_table
    WHERE id IN (
        4030764,4030734,4030752,3948666,4030743,4030751,4030742,4030740,4030757,4030733,4030763,4030748,4030741,
        4030735,4030744,4030753,4030737,4030736,4030731,8030076,4030739,6930873,4030727,4030758,4030726,8261466,
        4030801,4030756,4030730,4030759,7840188,7911304,4030762,4030728,4030729,6531602,4030755,4030754,4030760,
        4030749,4030750,4030761,8224616,4030738,4030732,4030746,4030747,4030745,4030871,4030872,4030790,4030787,
        3948662,4030797,4030791,4030775,4030794,4030772,4030796,4030798,3948648,4030792,4030789,4030773,4030799,
        3948661,3948651,4030788,4030778,3948657,4030800,4030795,4030793,4020117,4020363,3948663,4030777,3948658,
        3948650,4030776,4020292,4020210,3948649,4030717,3969465,3969459,4030779,4030704,4030694,4030713,8529197,
        4030873,3733656,3948664,4030786,4030781,4030783
    ) ORDER BY sq_dist LIMIT 1
) AS place,
(
            SELECT 0 AS index, 1636407.74908 AS x, -2220902.79092 AS y, -5744766.34094 AS z
      UNION SELECT 1,          1674317.79921,      -2157598.66673,      -5757951.69661
      UNION SELECT 2,          1652089.75753,      -2193845.00579,      -5750671.55762
      UNION SELECT 3,          1621803.74283,      -2184916.54092,      -5762679.25265
      UNION SELECT 4,          1615277.72619,      -2200110.86847,      -5758729.88373
      UNION SELECT 5,          1652642.77785,      -2208303.65375,      -5744975.77555
      UNION SELECT 6,          1618985.40684,      -2190362.00049,      -5761404.37734
      UNION SELECT 7,          1621151.08717,      -2208242.04656,      -5753965.24636
      UNION SELECT 8,          1663760.68219,      -2166853.74959,      -5757536.37073
      UNION SELECT 9,          1639392.0856,       -2136418.33191,      -5775871.37502
) AS point
ORDER BY point.index;

It would have worked, but the value of @sq_dist is not computed for some reason, and hence sorting on it does not work. It gives the following result -

+-------+---------------+----------------+----------------+------------+------------+---------+
| index | point_x       | point_y        | point_z        | lat_lon_id | place_id   | sq_dist |
+-------+---------------+----------------+----------------+------------+------------+---------+
|     0 | 1636407.74908 | -2220902.79092 | -5744766.34094 |     433534 |    8529197 |    NULL |
|     1 | 1674317.79921 | -2157598.66673 | -5757951.69661 |     433534 |    8529197 |    NULL |
|     2 | 1652089.75753 | -2193845.00579 | -5750671.55762 |     433534 |    8529197 |    NULL |
|     3 | 1621803.74283 | -2184916.54092 | -5762679.25265 |     433534 |    8529197 |    NULL |
|     4 | 1615277.72619 | -2200110.86847 | -5758729.88373 |     433534 |    8529197 |    NULL |
|     5 | 1652642.77785 | -2208303.65375 | -5744975.77555 |     433534 |    8529197 |    NULL |
|     6 | 1618985.40684 | -2190362.00049 | -5761404.37734 |     433534 |    8529197 |    NULL |
|     7 | 1621151.08717 | -2208242.04656 | -5753965.24636 |     433534 |    8529197 |    NULL |
|     8 | 1663760.68219 | -2166853.74959 | -5757536.37073 |     433534 |    8529197 |    NULL |
|     9 | 1639392.08560 | -2136418.33191 | -5775871.37502 |     433534 |    8529197 |    NULL |
+-------+---------------+----------------+----------------+------------+------------+---------+
10 rows in set (0.10 sec)

I have been banging my head against this all night. I tried using GROUP BY, but I have a hard time understanding its mysterious ways. I should be able to use GROUP BY to get only the minimum distance for each point, but I can't seem to figure it out.

Can anyone else think of a way to make this work?

Any help would be highly appreciated.

Thanks,

1 answer:

Answer 0: (score: 0)

So, I figured it out myself using GROUP BY. Here is the modified query -

SELECT t.latlon_id, t.place_id, t.latitude, t.longitude, t.indx, (@dist:=(pow(t.sq_dist, 0.5)/1000)) as dist from (
    SELECT place.latlon_id, place.place_id, place.latitude, place.longitude, place.x, place.y, place.z, point.indx, point.xx, point.yy, point.zz,
        (@sq_dist:=(pow(point.xx-place.x, 2) + pow(point.yy-place.y, 2) + pow(point.zz-place.z, 2))) AS sq_dist
    FROM (
            SELECT 0 AS indx, 1636407.74908 AS xx, -2220902.79092 AS yy, -5744766.34094 AS zz
      UNION SELECT 1,          1674317.79921,      -2157598.66673,      -5757951.69661
      UNION SELECT 2,          1652089.75753,      -2193845.00579,      -5750671.55762
      UNION SELECT 3,          1621803.74283,      -2184916.54092,      -5762679.25265
      UNION SELECT 4,          1615277.72619,      -2200110.86847,      -5758729.88373
      UNION SELECT 5,          1652642.77785,      -2208303.65375,      -5744975.77555
      UNION SELECT 6,          1618985.40684,      -2190362.00049,      -5761404.37734
      UNION SELECT 7,          1621151.08717,      -2208242.04656,      -5753965.24636
      UNION SELECT 8,          1663760.68219,      -2166853.74959,      -5757536.37073
      UNION SELECT 9,          1639392.0856,       -2136418.33191,      -5775871.37502
    ) as point
    JOIN geonames_geonamelookup place
    WHERE place.geoname_id IN (
        4030764,4030734,4030752,3948666,4030743,4030751,4030742,4030740,4030757,4030733,4030763,4030748,4030741,
        4030735,4030744,4030753,4030737,4030736,4030731,8030076,4030739,6930873,4030727,4030758,4030726,8261466,
        4030801,4030756,4030730,4030759,7840188,7911304,4030762,4030728,4030729,6531602,4030755,4030754,4030760,
        4030749,4030750,4030761,8224616,4030738,4030732,4030746,4030747,4030745,4030871,4030872,4030790,4030787,
        3948662,4030797,4030791,4030775,4030794,4030772,4030796,4030798,3948648,4030792,4030789,4030773,4030799,
        3948661,3948651,4030788,4030778,3948657,4030800,4030795,4030793,4020117,4020363,3948663,4030777,3948658,
        3948650,4030776,4020292,4020210,3948649,4030717,3969465,3969459,4030779,4030704,4030694,4030713,8529197,
        4030873,3733656,3948664,4030786,4030781,4030783
    ) ORDER BY sq_dist
) as t GROUP BY t.indx;

MySQL's GROUP BY is not intuitive at all...
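
A note on the query above: it depends on GROUP BY returning, for each indx, the first row of the ordered subquery, which, as far as I know, MySQL does not guarantee (and newer versions with ONLY_FULL_GROUP_BY enabled reject this kind of query). A more explicit pattern is to compute the minimum squared distance per point and join it back to the point-place pairs. Below is a minimal sketch of that pattern, run against an in-memory SQLite database from Python only so that it is self-contained. The points table, the sample coordinates, and the cut-down places_table are made up for illustration, and the squares are written as products because pow() may not be available in SQLite.

```python
import sqlite3

# In-memory stand-in for the real MySQL database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE places_table (place_id INTEGER PRIMARY KEY, x REAL, y REAL, z REAL);
    INSERT INTO places_table VALUES (1, 0, 0, 0), (2, 10, 0, 0), (3, 0, 10, 0);
    -- Hypothetical table standing in for the SELECT ... UNION list of points.
    CREATE TABLE points (indx INTEGER PRIMARY KEY, xx REAL, yy REAL, zz REAL);
    INSERT INTO points VALUES (0, 1, 1, 0), (1, 9, 1, 0), (2, 1, 8, 0);
""")

# Every point-place pair with its squared distance.
pairs = """
    SELECT pt.indx, pl.place_id,
           (pt.xx - pl.x) * (pt.xx - pl.x)
         + (pt.yy - pl.y) * (pt.yy - pl.y)
         + (pt.zz - pl.z) * (pt.zz - pl.z) AS sq_dist
    FROM points AS pt CROSS JOIN places_table AS pl
"""

# Keep only the pairs whose distance equals the per-point minimum.
query = f"""
    SELECT d.indx, d.place_id, d.sq_dist
    FROM ({pairs}) AS d
    JOIN (
        SELECT indx, MIN(sq_dist) AS min_sq_dist
        FROM ({pairs}) AS p
        GROUP BY indx
    ) AS m ON m.indx = d.indx AND m.min_sq_dist = d.sq_dist
    ORDER BY d.indx
"""

nearest_rows = conn.execute(query).fetchall()
for indx, place_id, sq_dist in nearest_rows:
    print(indx, place_id, sq_dist)
```

If two places are exactly equidistant from a point, this returns both rows for that point, so a tie-break would have to be added where that matters.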

Edit: Note that the query above is very slow and does not scale at all. What I am currently doing from Python code, which seems to work well and fast, is:

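# Assumes xyz_s is a list of (x, y, z) tuples, nbrs is a comma-separated string of
# place ids, and cursor is an open MySQL DB-API cursor.
count = len(xyz_s)
# One squared-distance column per point; each %s is filled with a coordinate below.
dist_phrase = ', (pow(x-%s, 2) + pow(y-%s, 2) + pow(z-%s, 2))' * count
query = """
        select place_id, latlon_id, latitude, longitude {dist_phrase} from place_table where place_id in ({nbrs})
""".format(dist_phrase=dist_phrase, nbrs=nbrs)  % tuple(a for xyz in xyz_s for a in xyz)
cursor.execute(query)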

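This now gives me as many rows as there are places to look up, plus additional columns containing the squared distance to each point, and I then determine the nearest place for each point in Python itself.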
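
That last step, picking the nearest place per point from the returned rows, could look roughly like this. It is only a sketch, assuming each row is (place_id, latlon_id, latitude, longitude) followed by one squared-distance column per point, in the same order as the points; the function name and the sample rows are hypothetical.

```python
def nearest_place_per_point(rows, n_points, n_place_cols=4):
    """Return, for each point index, a (place_columns, sq_dist) pair for the closest place.

    rows: iterable of tuples laid out as
          (place_id, latlon_id, latitude, longitude, d_0, ..., d_{n_points-1}).
    """
    best = [None] * n_points
    for row in rows:
        place = row[:n_place_cols]
        dists = row[n_place_cols:n_place_cols + n_points]
        for i, sq_dist in enumerate(dists):
            if sq_dist is None:
                # Skip places with missing coordinates (NULL distance).
                continue
            if best[i] is None or sq_dist < best[i][1]:
                best[i] = (place, sq_dist)
    return best


# Made-up rows: three places, two points.
rows = [
    (101, 1, 10.0, 20.0, 4.0, 25.0),
    (102, 2, 11.0, 21.0, 9.0, 1.0),
    (103, 3, 12.0, 22.0, 16.0, 36.0),
]
nearest = nearest_place_per_point(rows, n_points=2)
for i, (place, sq_dist) in enumerate(nearest):
    print(i, place[0], sq_dist ** 0.5)
```

In real use, rows would come from cursor.fetchall() and n_points would be len(xyz_s). This is a single pass over the result, so the work is proportional to the number of places times the number of points.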

Hope this helps someone.