How can I speed up EDIT_DISTANCE and an insert query?

Time: 2015-09-22 01:15:54

Tags: oracle performance edit-distance

---------------
MASTER TABLE
---------------
DATA_KEY NUMBER
TEXT VARCHAR2(2000)
ORDER_NO NUMBER

---------------
DETAIL TABLE
---------------
DATA_KEY NUMBER
SIMILAR_DATA_KEY NUMBER
DISTANCE_COUNT NUMBER

---------------
INSERT QUERY
---------------
INSERT INTO DETAIL
(
  SELECT DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT
  FROM
  (
  SELECT A.DATA_KEY AS DATA_KEY, B.DATA_KEY AS SIMILAR_DATA_KEY, 
    UTL_MATCH.EDIT_DISTANCE(A.TEXT, B.TEXT) AS DISTANCE_COUNT
  FROM 
    (SELECT DATA_KEY, TEXT, ORDER_NO FROM MASTER) A
    INNER JOIN
    (SELECT DATA_KEY, TEXT, ORDER_NO FROM MASTER) B
    ON (A.ORDER_NO < B.ORDER_NO)
  )
  WHERE DISTANCE_COUNT <= 5
)
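One possible variation on the query above, offered only as a sketch: the edit distance between two strings can never be smaller than the difference of their lengths, so a pair whose lengths differ by more than 5 cannot pass the `DISTANCE_COUNT <= 5` filter. The `CASE` expression below uses that to skip the `UTL_MATCH.EDIT_DISTANCE` call for such pairs. It assumes that TEXT is never NULL and that `LENGTH` counts the same units `EDIT_DISTANCE` compares (which holds for single-byte data); how much it helps depends on how widely the TEXT lengths vary.

```sql
-- Sketch only: same pairs as the original query, with a cheap length check
-- evaluated before the expensive UTL_MATCH.EDIT_DISTANCE call.
-- Assumptions: TEXT is never NULL, and the data is single-byte so that
-- LENGTH and EDIT_DISTANCE count the same units.
INSERT INTO DETAIL (DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT)
SELECT DATA_KEY, SIMILAR_DATA_KEY, DISTANCE_COUNT
FROM
(
  SELECT A.DATA_KEY AS DATA_KEY, B.DATA_KEY AS SIMILAR_DATA_KEY,
    -- CASE short-circuits: EDIT_DISTANCE runs only when the lengths are
    -- within 5 of each other. Every other pair gets NULL here and is
    -- dropped by the outer WHERE.
    CASE
      WHEN ABS(LENGTH(A.TEXT) - LENGTH(B.TEXT)) <= 5
      THEN UTL_MATCH.EDIT_DISTANCE(A.TEXT, B.TEXT)
    END AS DISTANCE_COUNT
  FROM MASTER A
    INNER JOIN MASTER B
    ON (A.ORDER_NO < B.ORDER_NO)
)
WHERE DISTANCE_COUNT <= 5;
```

The join still visits every pair with `A.ORDER_NO < B.ORDER_NO`; only the per-pair cost changes.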

I need to compare each TEXT value in the MASTER table with every other TEXT value.

There are no indexes. The MASTER table has 90,000 rows.

The ORDER_NO column (values 1 .. 90000) is there to avoid duplicate comparisons.
=============================================================
A.ORDER_NO < B.ORDER_NO
------------------------------------------
1, 1 <- exclude
1, 2 <- join
1, 3 <- join
1, 4 <- join
..
2, 1 <- exclude
2, 2 <- exclude
2, 3 <- join
2, 4 <- join
...
3, 1 <- exclude
3, 2 <- exclude
3, 3 <- exclude
3, 4 <- join

1. No need to compare 1 and 1
2. Need to compare 1 and 2
3. No need to compare 2 and 1 (it duplicates item 2)

So this condition reduces the number of comparisons.
=============================================================

Is the slow part the filter (WHERE DISTANCE_COUNT <= 5)?

Or is the slow part the number of row comparisons (90000*89999/2)?
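To put a number on that expression, the arithmetic can be checked directly in Oracle. The second column is only the average rate implied by a 7-day run, not a measurement.

```sql
-- Number of unordered pairs among 90,000 rows, and the average rate
-- implied by a 7-day run (7 * 24 * 60 * 60 = 604,800 seconds).
SELECT 90000 * 89999 / 2                             AS PAIR_COUNT,
       ROUND(90000 * 89999 / 2 / (7 * 24 * 60 * 60)) AS PAIRS_PER_SECOND
FROM DUAL;
-- PAIR_COUNT = 4049955000, PAIRS_PER_SECOND = 6696
```

So the join produces about 4.05 billion pairs, and EDIT_DISTANCE is evaluated for each of them.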

The query takes 7 days to run.

It inserts 6,000 rows into the DETAIL table.

How can I speed this up?

Sorry for my poor English.

0 answers:

No answers yet