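In this example, I have a table containing a list of persons, a group category, and each person's location (longitude/latitude coordinates). A person can belong to more than one group. Here is a sample table: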
Person Group Long Lat
1 1 11 23
2 1 12 24
. . . .
. . . .
. . . .
2 2 12 24
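I also have a second table that lists businesses, their locations, and a group that matches the groups in the first table. Likewise, a business can belong to more than one group. Sample table: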
Busns Group Long Lat
5 1 5 6
6 1 6 7
. . . .
. . . .
. . . .
5 2 5 6
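For each person and each group, I want to find the business at the minimum distance. This turns out to be a very memory-intensive task. Currently, I use a RIGHT JOIN to create one huge table that measures the distance between every person and every business within each group. I then create a second table that finds the minimum distance for each person in each group, and finally do an INNER JOIN back to the first table to recover the matching business. Sample code: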
DROP TABLE IF EXISTS DistancePairs;
CREATE LOCAL TEMPORARY TABLE DistancePairs ON COMMIT PRESERVE ROWS AS (
    SELECT a.Person
          ,a.Group
          ,b.Business
          ,a.Latitude AS PersonLat
          ,a.Longitude AS PersonLong
          ,b.Latitude AS BusinessLat
          ,b.Longitude AS BusinessLong
          ,0.621371*DISTANCEV(a.Latitude,a.Longitude,b.Latitude,b.Longitude) AS AproxDistance
    FROM people a
    RIGHT JOIN business b
        ON a.Group = b.Group
);
DROP TABLE IF EXISTS MinDist;
CREATE LOCAL TEMPORARY TABLE MinDist ON COMMIT PRESERVE ROWS AS (
    -- minimum distance per person and group
    SELECT Person
          ,Group
          ,MIN(AproxDistance) AS AproxDistance
    FROM DistancePairs
    GROUP BY Person
            ,Group
);
-- join back to recover the business that sits at the minimum distance
SELECT a.Person
      ,a.Group
      ,a.Business
      ,a.AproxDistance
FROM DistancePairs a
JOIN MinDist b
    ON a.Person = b.Person
    AND a.Group = b.Group
    AND a.AproxDistance = b.AproxDistance
;
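Is there a better way? Given the size of the dataset I am working with, this performs very badly and runs for hours. The original Person and Business tables were already created with WHERE clauses to limit their size.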
Answer 0 (score: 1)
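Could you try doing the join inside a single query and then applying a LIMIT clause?
I only have a tiny bit of sample data, so I can't really test whether this makes sense or is nonsense. But here goes: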
WITH
-- this is your input data ...
persons ( Person, grp, Long, Lat ) AS (
          SELECT 1 , 1 , 11 , 23
UNION ALL SELECT 2 , 1 , 12 , 24
UNION ALL SELECT 2 , 2 , 12 , 24
)
,
-- and this, is also your input data ....
businesses (Busns, grp, Long, Lat) AS (
          SELECT 5 , 1 , 5 , 6
UNION ALL SELECT 6 , 1 , 6 , 7
UNION ALL SELECT 5 , 2 , 5 , 6
)
,
-- real WITH clause would start here ....
join_and_calc AS (
  SELECT
    person
  , p.grp
  , busns
  , p.lat
  , p.long
  , b.lat
  , b.long
  , 0.621371 * DISTANCEV(p.lat,p.long,b.lat,b.long) AS app_dist
  FROM persons p
  JOIN businesses b USING(grp)
)
SELECT
  *
FROM join_and_calc
LIMIT 1 OVER(PARTITION BY person,grp,busns ORDER BY app_dist)
;
The result I get is:
person | grp | busns | lat | long | lat | long | app_dist
--------+-----+-------+-----+------+-----+------+------------------
1 | 1 | 5 | 23 | 11 | 6 | 5 | 1235.42458453758
1 | 1 | 6 | 23 | 11 | 7 | 6 | 1149.36524763703
2 | 1 | 5 | 24 | 12 | 6 | 5 | 1322.28298287477
2 | 1 | 6 | 24 | 12 | 7 | 6 | 1234.90557929051
2 | 2 | 5 | 24 | 12 | 6 | 5 | 1322.28298287477
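All five joined rows come back in the output above, because each (person, grp, busns) combination forms its own one-row partition. If the goal is the single nearest business per person and group, as the question describes, one possible variant is to partition by person and grp only. This is a sketch that has not been run against real data; the aliases p_lat, p_long, b_lat and b_long are names introduced here for illustration.

```sql
WITH
-- sample input, same values as above
persons ( Person, grp, Long, Lat ) AS (
          SELECT 1 , 1 , 11 , 23
UNION ALL SELECT 2 , 1 , 12 , 24
UNION ALL SELECT 2 , 2 , 12 , 24
)
,
businesses ( Busns, grp, Long, Lat ) AS (
          SELECT 5 , 1 , 5 , 6
UNION ALL SELECT 6 , 1 , 6 , 7
UNION ALL SELECT 5 , 2 , 5 , 6
)
,
-- pair every person with every business of the same group
join_and_calc AS (
  SELECT
    person
  , p.grp
  , busns
  , p.lat  AS p_lat   -- hypothetical alias, avoids duplicate column names
  , p.long AS p_long  -- hypothetical alias
  , b.lat  AS b_lat   -- hypothetical alias
  , b.long AS b_long  -- hypothetical alias
  , 0.621371 * DISTANCEV(p.lat,p.long,b.lat,b.long) AS app_dist
  FROM persons p
  JOIN businesses b USING(grp)
)
SELECT
  *
FROM join_and_calc
-- assumption: keep one row per (person, grp), the one with the smallest distance
LIMIT 1 OVER(PARTITION BY person,grp ORDER BY app_dist)
;
```

Going by the distances in the output above, this would keep business 6 for person 1 in group 1, business 6 for person 2 in group 1, and business 5 for person 2 in group 2.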
Good luck - Marco