---------------
MASTER TABLE
---------------
DATA_KEY NUMBER
TEXT VARCHAR2(2000)
ORDER_NO NUMBER
---------------
DETAIL TABLE
---------------
DATA_KEY NUMBER
SIMILAR_DATA_KEY NUMBER
DISTANCE_COUNT NUMBER
---------------
INSERT QUERY
---------------
INSERT INTO DETAIL
(
    SELECT DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT
    FROM
    (
        SELECT A.DATA_KEY AS DATA_KEY, B.DATA_KEY AS SIMILAR_DATA_KEY,
               UTL_MATCH.EDIT_DISTANCE(A.TEXT, B.TEXT) AS DISTANCE_COUNT
        FROM
            (SELECT DATA_KEY, TEXT, ORDER_NO FROM MASTER) A
            INNER JOIN
            (SELECT DATA_KEY, TEXT, ORDER_NO FROM MASTER) B
            ON (A.ORDER_NO < B.ORDER_NO)
    )
    WHERE DISTANCE_COUNT <= 5
)
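I need to compare the TEXT field of each row in the MASTER table with the TEXT field of every other row.
There are no indexes. The MASTER table has 90,000 rows.
The ORDER_NO field (values 1 .. 90000) is there to avoid comparing the same pair twice.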
=============================================================
A.ORDER_NO < B.ORDER_NO
------------------------------------------
1, 1 <- exclude
1, 2 <- join
1, 3 <- join
1, 4 <- join
...
2, 1 <- exclude
2, 2 <- exclude
2, 3 <- join
2, 4 <- join
...
3, 1 <- exclude
3, 2 <- exclude
3, 3 <- exclude
3, 4 <- join
1. No need to compare 1 with 1.
2. Need to compare 1 with 2.
3. No need to compare 2 with 1 (it duplicates case 2).
So the condition is there to reduce the number of comparisons.
=============================================================
Is the slow part the filter (WHERE DISTANCE_COUNT <= 5)?
Or is the slow part the number of row comparisons (90000*89999/2)?
The query took 7 days to run.
It inserted 6,000 rows into the DETAIL table.
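For scale, the figures above can be checked with a quick calculation. This is only a sketch: it assumes ORDER_NO holds each value from 1 to 90000 exactly once, and that the whole 7 days was spent on the pair comparisons, so the rate is a rough average. The column aliases PAIR_COUNT and PAIRS_PER_SECOND are made up for illustration.

```sql
-- Number of pairs that satisfy A.ORDER_NO < B.ORDER_NO for 90,000 rows,
-- and the average number of pairs handled per second over a 7-day run
-- (7 * 24 * 60 * 60 = 604,800 seconds).
SELECT 90000 * 89999 / 2                             AS PAIR_COUNT,
       ROUND(90000 * 89999 / 2 / (7 * 24 * 60 * 60)) AS PAIRS_PER_SECOND
FROM DUAL;
-- PAIR_COUNT = 4049955000, PAIRS_PER_SECOND = 6696
```

So the join produces about 4 billion candidate pairs, each of which calls UTL_MATCH.EDIT_DISTANCE on strings of up to 2000 characters, while only 6,000 rows pass the filter.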
How can I speed this up?
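One direction that might be worth checking is a length pre-filter. An edit distance can never be smaller than the difference between the two string lengths, so any pair whose lengths differ by more than 5 cannot pass DISTANCE_COUNT <= 5. The sketch below has not been run against the real data. It assumes TEXT is never NULL, and the optimizer is not guaranteed to evaluate the cheap LENGTH check before the function call, so the execution plan would need to be checked.

```sql
-- Sketch only: the same insert as above, plus a length pre-filter in the join.
-- Assumption: MASTER.TEXT is never NULL (a NULL length would drop the pair here).
INSERT INTO DETAIL (DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT)
SELECT DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT
FROM
(
    SELECT A.DATA_KEY AS DATA_KEY,
           B.DATA_KEY AS SIMILAR_DATA_KEY,
           UTL_MATCH.EDIT_DISTANCE(A.TEXT, B.TEXT) AS DISTANCE_COUNT
    FROM MASTER A
    INNER JOIN MASTER B
        ON  A.ORDER_NO < B.ORDER_NO
        -- Pairs whose lengths differ by more than 5 cannot be within distance 5.
        AND ABS(LENGTH(A.TEXT) - LENGTH(B.TEXT)) <= 5
)
WHERE DISTANCE_COUNT <= 5;
```

How much this helps depends on how widely the TEXT lengths vary. If most rows have similar lengths, few pairs are skipped and the runtime would barely change.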
Sorry for my poor English...