I have a table of destinations with lat/lon data (~100K records):
DESTINATIONS {
id,
lat,
lon,
...
}
Now I need to insert the distances into a new table:
DISTANCES {
id_a,
id_b,
distance
}
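What is the best way to do this?
This is the distance calculation (the factor 111045 gives meters per degree; 111.045 would give kilometers):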
ROUND(111045
* DEGREES(ACOS(COS(RADIANS(A.lat))
* COS(RADIANS(B.lat))
* COS(RADIANS(A.lon) - RADIANS(B.lon))
+ SIN(RADIANS(A.lat))
* SIN(RADIANS(B.lat)))),0)
AS 'distance'
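One practical caveat with this formula: for identical or nearly identical points, floating-point rounding can push the cosine term slightly above 1, and SQL Server's ACOS then fails with an invalid floating point operation error. The following is a minimal sketch of the same calculation with the cosine clamped to [-1, 1]. It assumes lat and lon are stored as FLOAT or DECIMAL (with integer columns RADIANS would return integers), and the origin id 1 and the alias cos_angle are purely illustrative.

```sql
-- Sketch only: same spherical law of cosines as above, with the cosine
-- clamped to [-1, 1] so that ACOS never receives a value such as 1.0000000000000002.
SELECT
    A.id AS id_a,
    B.id AS id_b,
    ROUND(111045 * DEGREES(ACOS(
        CASE WHEN c.cos_angle >  1 THEN  1
             WHEN c.cos_angle < -1 THEN -1
             ELSE c.cos_angle
        END)), 0) AS distance          -- meters
FROM DESTINATIONS AS A
JOIN DESTINATIONS AS B
    ON B.id <> A.id
CROSS APPLY (
    -- Compute the cosine of the central angle once so it can be clamped.
    SELECT COS(RADIANS(A.lat)) * COS(RADIANS(B.lat))
           * COS(RADIANS(A.lon) - RADIANS(B.lon))
         + SIN(RADIANS(A.lat)) * SIN(RADIANS(B.lat)) AS cos_angle
) AS c
WHERE A.id = 1;                        -- hypothetical origin id, keeps the example small
```

Restricting the query to a single origin keeps it cheap to try out; without the WHERE clause it would evaluate every pair of the ~100K rows.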
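OK, the JOIN is no problem, but how can I implement the three "filters"?
Maybe with a WHILE loop and a sub-SELECT using LIMIT / TOP 100 ORDER BY distance ASC?
Or can it be done with a JOIN inside the INSERT?
Does anyone have an idea?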
Answer 0: (score: 1)
Pseudocode:
INSERT INTO [newTable] (ColumnList...)
SELECT TOP 100 a.id, b.id, DistanceFormula(a.id, b.id)
FROM Destination a
CROSS JOIN Destination b
WHERE a.id<b.id
ORDER BY DistanceFormula(a.id, b.id) ASC
Edit, to get 100 b rows for each a:
INSERT INTO [newTable] (ColumnList...)
SELECT a.id, b.id, DistanceFormula(a.id, b.id)
FROM Destination a
INNER JOIN Destination b
ON b.id IN (  -- IN rather than =, because the subquery returns up to 100 ids
SELECT TOP 100 c.id
FROM Destination c
WHERE a.id<c.id
ORDER BY DistanceFormula(a.id, c.id) ASC
)
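In T-SQL, a "top N per row" query like this is often written with CROSS APPLY, which also avoids evaluating the distance twice. The sketch below is one possible form rather than the answer's own method. It reuses the table and column names from the question and makes two assumptions: neighbours are filtered with b.id <> a.id instead of a.id < b.id (so each destination gets its 100 nearest among all others, not only among higher ids), and lat/lon are FLOAT or DECIMAL.

```sql
-- Sketch only: for each destination a, insert its 100 nearest other destinations.
INSERT INTO DISTANCES (id_a, id_b, distance)
SELECT a.id, n.id, n.distance
FROM DESTINATIONS AS a
CROSS APPLY (
    -- The inner query is evaluated once per row of a.
    SELECT TOP (100)
        b.id,
        ROUND(111045 * DEGREES(ACOS(
              COS(RADIANS(a.lat)) * COS(RADIANS(b.lat))
            * COS(RADIANS(a.lon) - RADIANS(b.lon))
            + SIN(RADIANS(a.lat)) * SIN(RADIANS(b.lat)))), 0) AS distance
    FROM DESTINATIONS AS b
    WHERE b.id <> a.id            -- assumption: exclude only the row itself
    ORDER BY distance ASC         -- sorts by the alias computed above
) AS n;
```

This still compares every destination with every other one, so on ~100K rows it is likely to run for a long time; it only makes the per-row limit explicit.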
Answer 1: (score: 0)
I simplified it (the distance calculation is replaced by placeholders)...
INSERT INTO [DISTANCES] (id_a, id_b, distance)
SELECT
A.id,
B.id,
25 /*ROUND(111045 * DEGREES(ACOS(COS(RADIANS(A.geo_lat)) * COS(RADIANS(B.geo_lat)) * COS(RADIANS(A.geo_lon) - RADIANS(B.geo_lon)) + SIN(RADIANS(A.geo_lat)) * SIN(RADIANS(B.geo_lat)))),0)*/
FROM [DESTINATIONS] AS A
INNER JOIN [DESTINATIONS] AS B
ON b.id IN(
SELECT TOP 100
C.id
FROM [DESTINATIONS] AS C
WHERE
A.id < C.id
ORDER BY A.id /*ROUND(111045 * DEGREES(ACOS(COS(RADIANS(A.geo_lat)) * COS(RADIANS(C.geo_lat)) * COS(RADIANS(A.geo_lon) - RADIANS(C.geo_lon)) + SIN(RADIANS(A.geo_lat)) * SIN(RADIANS(C.geo_lat)))),0)*/ ASC
)
Is this what you meant?
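Answer 2: (score: 0)
OK, this works. :)
But it is definitely far too slow!
I will write a routine that, on request, returns only the 100 nearest results. Another (sub)routine will then insert/update those results (on the application side) into the distances table together with a timestamp, so that any existing results can be reused on the next call.
But thank you very much! :)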
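The compute-on-request idea described above could look roughly like the following. Note that this sketch keeps everything inside a T-SQL stored procedure, whereas the answer plans to do the insert/update on the application side. The procedure name dbo.GetNearest100, the INT id type, and the extra computed_at column on DISTANCES are all hypothetical and not part of the original schema.

```sql
-- Sketch only. Assumes DISTANCES has an additional column
-- computed_at DATETIME2 (hypothetical, not in the original schema).
CREATE PROCEDURE dbo.GetNearest100      -- hypothetical name
    @id_a INT                           -- assumption: ids are integers
AS
BEGIN
    SET NOCOUNT ON;

    -- Compute and store the 100 nearest destinations only on the first request.
    IF NOT EXISTS (SELECT 1 FROM DISTANCES WHERE id_a = @id_a)
    BEGIN
        INSERT INTO DISTANCES (id_a, id_b, distance, computed_at)
        SELECT TOP (100)
            a.id,
            b.id,
            ROUND(111045 * DEGREES(ACOS(
                  COS(RADIANS(a.lat)) * COS(RADIANS(b.lat))
                * COS(RADIANS(a.lon) - RADIANS(b.lon))
                + SIN(RADIANS(a.lat)) * SIN(RADIANS(b.lat)))), 0) AS dist_m,
            SYSUTCDATETIME()
        FROM DESTINATIONS AS a
        JOIN DESTINATIONS AS b
            ON b.id <> a.id
        WHERE a.id = @id_a
        ORDER BY dist_m ASC;
    END;

    -- Return the cached rows, whether they were just computed or already present.
    SELECT id_b, distance, computed_at
    FROM DISTANCES
    WHERE id_a = @id_a
    ORDER BY distance ASC;
END;
```

A call such as EXEC dbo.GetNearest100 @id_a = 1; would then scan the destinations table once for that origin on the first request and read from DISTANCES afterwards. The timestamp makes it possible to refresh stale rows later, which this sketch does not attempt.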