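I have a problem: I have to use regexp_like (or at least it is the only solution I have found) on a large table (28k rows now, possibly growing to 100k), but it takes far too long.
Here is an example of what I want to achieve: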
TABLE_1:
**NUMBER** **ANSWER**
100 Answer 1
1100, 1099 Answer 2
99 Answer 3
1099 Answer 4
TABLE_2:
**NUMBER**
100
1100
1099
99
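I want to create a view that compares the "NUMBER" columns of tables 1 and 2 and returns the "ANSWER" for each TABLE_2 "NUMBER". It would return something like this:
VIEW: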
**NUMBER** **ANSWER**
100 Answer 1
1100 Answer 2
1099 Multiple Answer
99 Answer 3
This is what I am doing now, and it takes forever:
SELECT A.*, (CASE
WHEN (select count(distinct B.ANSWER) from TABLE_2 B WHERE regexp_like(B.NUMBER,'(^|\s|,)'||A.NUMBER||'(\s|$|,)'))> 1
THEN 'Multiple Answer'
ELSE (select count(distinct B.ANSWER) from TABLE_2 B WHERE regexp_like(B.NUMBER,'(^|\s|,)'||A.NUMBER||'(\s|$|,)'))
END) FINAL_ANSWER
FROM Table_1 A
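Can anyone help me?
Answer 0 (score: 1)
If you are stuck with the unfortunate data model you have, here is an alternative to REGEXP_LIKE that you can try, to see whether it performs better: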
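select num
     , case when num_answers > 1 then 'multiple answer'
            else answer
       end final_answer
from (
  -- num stands in for the NUMBER column (NUMBER is a reserved word in Oracle)
  select a.num
       , count(distinct b.answer) num_answers  -- how many different answers matched
       , min(b.answer) answer                  -- the answer itself when there is only one
  from table_2 a
  left join table_1 b
    on ','||replace(b.num,' ','')||',' like '%,'||a.num||',%'
  group by a.num
);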
The replace removes any spaces, and a comma is added at both ends of the list, so the check becomes, for example, ',1100,1099,' like '%,1099,%'.
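To see why the commas on both ends matter, here is a small sketch that can be run against DUAL. The literals come from the sample data; the column aliases whole_number and partial_number are made up for illustration.

```sql
-- Illustrative check only: the wrapped list is ',1100,1099,'.
select case when ','||replace('1100, 1099',' ','')||',' like '%,'||'1099'||',%'
            then 'match' else 'no match'
       end as whole_number      -- 'match': ',1099,' occurs in the wrapped list
     , case when ','||replace('1100, 1099',' ','')||',' like '%,'||'109'||',%'
            then 'match' else 'no match'
       end as partial_number    -- 'no match': ',109,' does not occur
from dual;
```

Without the surrounding commas, a plain like '%109%' would also match 1099, which is the false positive the delimiters are there to prevent.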
Using a nested query avoids doing all the counting twice. You can use the same approach in the REGEXP_LIKE version.
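A sketch of what the REGEXP_LIKE version might look like in that nested form, assuming the same table_1 / table_2 tables and the num column name used in this answer. The pattern is the one from the question; using min(b.answer) to pick the single answer is an assumption of this sketch, not something the question specifies.

```sql
-- Sketch only: the regular expression is evaluated once per row pair,
-- and the outer query only decides what to display.
select num
     , case when num_answers > 1 then 'multiple answer'
            else answer
       end final_answer
from (
  select a.num
       , count(distinct b.answer) num_answers
       , min(b.answer) answer
  from table_2 a
  left join table_1 b
    on regexp_like(b.num, '(^|\s|,)'||a.num||'(\s|$|,)')
  group by a.num
);
```

This still evaluates the regular expression for every pair of rows, so it removes the duplicated work but not the cost of REGEXP_LIKE itself.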
Answer 1 (score: 1)
SELECT B."NUMBER",
CASE MAX( A.ANSWER )
WHEN MIN( A.ANSWER )
THEN MAX( A.ANSWER )
ELSE 'Multiple Answers'
END AS Answer
FROM TABLE_2 B
INNER JOIN
TABLE_1 A
ON( REGEXP_LIKE( A."NUMBER", '(^|\D)' || B."NUMBER" || '(\D|$)' ) )
GROUP BY B."NUMBER";
Alternatively, as @TonyAndrews suggested, you can use a string comparison such as LIKE, or INSTR:
FROM TABLE_2 B
INNER JOIN
TABLE_1 A
ON( INSTR( ','||REPLACE( A."NUMBER", ' ' )||',', ','||B."NUMBER"||',' ) > 0 )
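For comparison, here is a sketch of the LIKE form of the same join written out as a complete query. It assumes the TABLE_1 and TABLE_2 definitions used in this answer and reuses the select list from the REGEXP_LIKE query.

```sql
-- Sketch only: same grouping logic as the REGEXP_LIKE query,
-- with the join condition expressed as a delimiter-wrapped LIKE.
SELECT B."NUMBER",
       CASE MAX( A.ANSWER )
         WHEN MIN( A.ANSWER )
         THEN MAX( A.ANSWER )
         ELSE 'Multiple Answers'
       END AS Answer
FROM   TABLE_2 B
       INNER JOIN
       TABLE_1 A
       ON( ','||REPLACE( A."NUMBER", ' ' )||',' LIKE '%,'||B."NUMBER"||',%' )
GROUP BY B."NUMBER";
```

Whether LIKE or INSTR is faster on a given data set is best checked by comparing their execution plans.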
You can speed this up further by indexing ','||REPLACE( A."NUMBER", ' ' )||',' and ','||B."NUMBER"||',':
CREATE TABLE TABLE_1 ( "NUMBER", ANSWER ) AS
SELECT '100', 'Answer 1' FROM DUAL UNION ALL
SELECT '1100, 1099', 'Answer 2' FROM DUAL UNION ALL
SELECT '99', 'Answer 3' FROM DUAL UNION ALL
SELECT '1099', 'Answer 4' FROM DUAL;
CREATE TABLE TABLE_2 ( "NUMBER" ) AS
SELECT 100 FROM DUAL UNION ALL
SELECT 1099 FROM DUAL UNION ALL
SELECT 1100 FROM DUAL UNION ALL
SELECT 99 FROM DUAL;
CREATE INDEX T1_WITH_NO_SPACES__IDX ON TABLE_1 (
','||REPLACE("NUMBER",' ')||',',
ANSWER
);
CREATE INDEX T2_LIKE_WITH_COMMA__IDX ON TABLE_2 (
'%,'||"NUMBER"||',%',
"NUMBER"
);
SELECT B."NUMBER",
CASE MAX( A.ANSWER )
WHEN MIN( A.ANSWER )
THEN MAX( A.ANSWER )
ELSE 'Multiple Answers'
END AS Answer
FROM ( SELECT '%,'||"NUMBER"||',%' AS like_number,
"NUMBER"
FROM TABLE_2
) B
INNER JOIN
( SELECT ','||REPLACE("NUMBER",' ')||',' AS numbers,
ANSWER
FROM TABLE_1
) A
ON( numbers LIKE like_number )
GROUP BY B."NUMBER";
**Output**:
NUMBER ANSWER
---------- ----------------
100 Answer 1
1099 Multiple Answers
1100 Answer 2
99 Answer 3
Explain plan:
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 440 | 5 (20)| 00:00:01 |
| 1 | HASH GROUP BY | | 8 | 440 | 5 (20)| 00:00:01 |
| 2 | NESTED LOOPS | | 8 | 440 | 4 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | T2_LIKE_WITH_COMMA__IDX | 4 | 148 | 1 (0)| 00:00:01 |
|* 4 | INDEX FAST FULL SCAN| T1_WITH_NO_SPACES__IDX | 2 | 36 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter(','||REPLACE("NUMBER",' ')||',' LIKE '%,'||TO_CHAR("NUMBER")||',%')
Note
-----
- dynamic sampling used for this statement (level=2)