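I have a table with a string column that contains several delimited values, for example: a;b;c.
I need to split this string and use its values in a query. For example, I have the following table: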
str
a;b;c
b;c;d
a;c;d
I need to group by the individual values in the str column to get the following result:
str count(*)
a 2
b 2
c 3
d 2
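Is it possible to do this with a single SELECT query? I cannot create a temporary table, extract the values into it, and then query that table.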
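Answer 0 (score: 4)
From the comment on @PrzemyslawKruglej's answer:
"The main problem is the inner query with connect by, which generates an astonishing number of rows."
The number of generated rows can be reduced with the following approach: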
/* test table populated with sample data from your question */
create table t1(str) as (
  select 'a;b;c' from dual union all
  select 'b;c;d' from dual union all
  select 'a;c;d' from dual
);
-- Table created.
-- The number of rows generated depends solely on the longest string.
-- If (say) the longest string contains 3 words (not counting the separator `;`)
-- and we have 100 rows in our table, then we end up with 300 rows
-- for further processing, no more.
with occurrence(ocr) as (
  select level
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
  connect by level <= mx_t
)
select count(regexp_substr(t1.str, '[^;]+', 1, o.ocr)) as generated_for_3_rows
  from t1
 cross join occurrence o;
Result: for three rows, the longest of which contains three words, we generate 9 rows:
GENERATED_FOR_3_ROWS
--------------------
9
Final query:
with occurrence(ocr) as (
  select level
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
  connect by level <= mx_t
)
select res
     , count(res) as cnt
  from (select regexp_substr(t1.str, '[^;]+', 1, o.ocr) as res
          from t1
         cross join occurrence o)
 where res is not null
 group by res
 order by res;
Result:
RES CNT
----- ----------
a 2
b 2
c 3
d 2
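Find out more about the regexp_count() (11g and up) and regexp_substr() regular expression functions.
Note: Regular expression functions are relatively expensive to compute, and when a large amount of data has to be processed it may be worth considering a switch to plain PL/SQL. Here is an example.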
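As a rough illustration of avoiding regular expressions without leaving SQL, the sketch below swaps regexp_count() and regexp_substr() for length(), replace(), instr() and substr(). It is only one possible variant and not the PL/SQL approach mentioned in the note. It assumes the t1 table created earlier in this answer, a single-character ; separator, and that empty tokens should be ignored. It has not been benchmarked here.

```sql
-- Sketch only: same row-generator idea as the final query, without regexp_* calls.
-- Assumes table t1(str) from this answer and ';' as the only separator.
with occurrence(ocr) as (
  select level
    from ( -- number of tokens = number of separators + 1
           select max(length(str) - nvl(length(replace(str, ';')), 0) + 1) as mx_t
             from t1 ) t
  connect by level <= mx_t
)
select res
     , count(*) as cnt
  from ( -- wrap the string in separators, then cut out the text between
         -- the ocr-th and the (ocr + 1)-th separator; a missing separator
         -- gives a negative length, so substr() returns NULL
         select substr(';' || t1.str || ';',
                       instr(';' || t1.str || ';', ';', 1, o.ocr) + 1,
                       instr(';' || t1.str || ';', ';', 1, o.ocr + 1)
                         - instr(';' || t1.str || ';', ';', 1, o.ocr) - 1) as res
           from t1
          cross join occurrence o)
 where res is not null
 group by res
 order by res;
```

On the sample data this should return the same four rows as the regular-expression version. Whether it is actually faster depends on the data and the Oracle release, so it is worth measuring both.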
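Answer 1 (score: 1)
This is ugly, but it seems to work. The problem with splitting via CONNECT BY is that it returns duplicate rows. I managed to get rid of them, but you will have to test it: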
WITH
  data AS (
    SELECT 'a;b;c' AS val FROM dual
    UNION ALL SELECT 'b;c;d' AS val FROM dual
    UNION ALL SELECT 'a;c;d' AS val FROM dual
  )
SELECT token, COUNT(1)
FROM (
  SELECT DISTINCT token, lvl, val, p_val
  FROM (
    SELECT
      regexp_substr(val, '[^;]+', 1, level) AS token,
      level AS lvl,
      val,
      NVL(prior val, val) p_val
    FROM data
    CONNECT BY regexp_substr(val, '[^;]+', 1, level) IS NOT NULL
  )
  WHERE val = p_val
)
GROUP BY token;
TOKEN COUNT(1)
-------------------- ----------
d 2
b 2
a 2
c 3
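A commonly used way to avoid producing the duplicates in the first place is to tie every generated row to its own source row inside the CONNECT BY clause. The sketch below is one such variant and not the query from this answer. It assumes the val values are unique; if they are not, a real key column (for example a hypothetical id) should be compared instead of val. The PRIOR sys_guid() IS NOT NULL condition is there so that Oracle does not report a CONNECT BY loop when a row is connected to itself.

```sql
-- Sketch only: split each row along its own hierarchy, so no DISTINCT is needed.
-- Assumes val is unique per row; otherwise compare a key column instead.
with data as (
  select 'a;b;c' as val from dual union all
  select 'b;c;d' as val from dual union all
  select 'a;c;d' as val from dual
)
select token
     , count(*) as cnt
  from (select regexp_substr(val, '[^;]+', 1, level) as token
          from data
       connect by regexp_substr(val, '[^;]+', 1, level) is not null
              and prior val = val                 -- stay on the same source row
              and prior sys_guid() is not null)   -- prevents loop detection
 group by token
 order by token;
```

With this form each source row produces exactly one row per token, so the intermediate result grows with the number of tokens rather than with the number of row combinations.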
Answer 2 (score: 0)
SELECT NAME, COUNT(NAME)
FROM (
  SELECT NAME FROM (SELECT rownum AS ID, REGEXP_SUBSTR('a;b;c', '[^;]+', 1, LEVEL) NAME
                    FROM dual CONNECT BY REGEXP_SUBSTR('a;b;c', '[^;]+', 1, LEVEL) IS NOT NULL)
  UNION ALL
  SELECT NAME FROM (SELECT rownum AS ID, REGEXP_SUBSTR('b;c;d', '[^;]+', 1, LEVEL) NAME
                    FROM dual CONNECT BY REGEXP_SUBSTR('b;c;d', '[^;]+', 1, LEVEL) IS NOT NULL)
  UNION ALL
  SELECT NAME FROM (SELECT rownum AS ID, REGEXP_SUBSTR('a;c;d', '[^;]+', 1, LEVEL) NAME
                    FROM dual CONNECT BY REGEXP_SUBSTR('a;c;d', '[^;]+', 1, LEVEL) IS NOT NULL)
)
GROUP BY NAME;
NAME COUNT(NAME)
----- -----------
d 2
a 2
b 2
c 3