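I have a column "DESCRIPTION" (VARCHAR2(500 BYTE)).
As a result I want two columns: first, extract the unique words from every cell and show them in one column; second, count how often each word occurs.
I also have a restricting parameter "ENTRYDATE" (i.e. "WHERE ENTRYDATE BETWEEN 20180101 and 20190101"), because the table is huge.
I have a sort of solution in Excel, but doing it that way is cumbersome and painful.
Is this even possible in Oracle with a SELECT?
Example: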
NOR | EXPLANATION
1 | roses are red violets are blue
2 | red violets
3 | red
4 | roses
5 | blue
Result:
WORDS | COUNTING
roses | 2
are | 2
red | 3
violets | 2
blue | 2
Query variant I tried:
with test as
(select 1 as nor, 'roses are red violets are blue' as explanation from dual union all
select 2 as nor, 'red violets' as explanation from dual union all
select 3 as nor, 'red' as explanation from dual union all
select 4 as nor, 'roses' as explanation from dual union all
select 5 as nor, 'blue' as explanation from dual
),
temp as
(select nor,
trim(column_value) word
from test join xmltable(('"' || replace(explanation, ' ', '","') ||'"')) on 1 = 1
)
select word,
count(*)
from temp
group by word
order by word;
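It returns ORA-00905: missing keyword
Answer 0 (score: 0)
Split the explanation into rows (so that you get the individual words), then apply the COUNT function to those words.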
SQL> with test (nor, explanation) as
       (select 1, 'roses are red violets are blue' from dual union all
        select 2, 'red violets' from dual union all
        select 3, 'red' from dual union all
        select 4, 'roses' from dual union all
        select 5, 'blue' from dual
       ),
     temp as
       (select nor,
               regexp_substr(explanation, '[^ ]+', 1, column_value) word
        from test join table(cast(multiset(select level from dual
                                           connect by level <= regexp_count(explanation, ' ') + 1
                                          ) as sys.odcinumberlist)) on 1 = 1
       )
     select word,
            count(*)
     from temp
     group by word
     order by word;
WORD                             COUNT(*)
------------------------------ ----------
are                                     2
blue                                    2
red                                     3
roses                                   2
violets                                 2
SQL>
You mentioned the ENTRYDATE column, but the sample data does not contain one, so include it in the TEMP CTE if necessary.
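As a rough sketch of that idea, the date filter could be applied before the rows are split, so that only matching rows get expanded into words. The table name my_table is hypothetical, the FILTERED CTE and the NULL check are my own additions, and ENTRYDATE is assumed to hold numbers such as 20180101, as the WHERE clause in the question suggests.

```sql
-- Sketch only: my_table is a hypothetical table name; DESCRIPTION and
-- ENTRYDATE are the columns named in the question.
with filtered as
  (select description
   from my_table
   where entrydate between 20180101 and 20190101   -- restrict rows first
  ),
temp as
  (select regexp_substr(description, '[^ ]+', 1, column_value) word
   from filtered join table(cast(multiset(select level from dual
                                          connect by level <= regexp_count(description, ' ') + 1
                                         ) as sys.odcinumberlist)) on 1 = 1
  )
select word,
       count(*) counting
from temp
where word is not null   -- skips empty slots caused by repeated spaces
group by word
order by word;
```

Like the query above, this relies on REGEXP_COUNT, so it needs a reasonably recent Oracle version.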
[EDIT: whoa, Oracle 9i ... back to the dark ages]
See if this helps; I hope it does:
SQL> with test (nor, explanation) as
       (select 1, 'roses are red violets are blue' from dual union all
        select 2, 'red violets' from dual union all
        select 3, 'red' from dual union all
        select 4, 'roses' from dual union all
        select 5, 'blue' from dual
       ),
     temp as
       (select nor,
               trim(column_value) word
        from test join xmltable(('"' || replace(explanation, ' ', '","') ||'"')) on 1 = 1
       )
     select word,
            count(*)
     from temp
     group by word
     order by word;
WORD                   COUNT(*)
-------------------- ----------
are                           2
blue                          2
red                           3
roses                         2
violets                       2
SQL>
Answer 1 (score: 0)
-- Oracle 12c+
with test (nor, explanation) as (
select 1, 'roses are red violets are blue' from dual union all
select 2, 'red violets' from dual union all
select 3, 'red' from dual union all
select 4, 'roses' from dual union all
select 5, 'blue' from dual)
select regexp_substr(explanation, '\S+', 1, lvl) word, count(*) cnt
from test,
lateral(
select rownum lvl
from dual
connect by level <= regexp_count(explanation, '\S+')
)
group by regexp_substr(explanation, '\S+', 1, lvl);
WORD                                  CNT
------------------------------ ----------
roses                                   2
are                                     2
violets                                 2
red                                     3
blue                                    2
Answer 2 (score: 0)
The problem is your old Oracle version. This query should work; it uses only the basic connect by, instr and dbms_random:
select word, count(1) counting
from (
select id, trim(case pos2 when 0 then substr(description, pos1)
else substr(description, pos1, pos2 - pos1)
end) word
from (
select id, description,
case level when 1 then 1 else instr(description, ' ', 1, level - 1) end pos1,
instr(description, ' ', 1, level) pos2
from t
connect by prior dbms_random.value is not null
and prior id = id
and level <= length(description) - length(replace(description, ' ', '')) + 1))
group by word
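
To see how the two INSTR calls bracket each word, the position arithmetic can be run on its own against a single literal. This is only an illustration using 'red violets' from the question's sample data; the alias n is mine and not part of the query above.

```sql
-- Illustration only: the same pos1/pos2 arithmetic applied to one literal.
select level as n,
       case level when 1 then 1
                  else instr('red violets', ' ', 1, level - 1)
       end as pos1,                                  -- where the n-th word starts
       instr('red violets', ' ', 1, level) as pos2   -- the n-th space, 0 if none
from dual
connect by level <= length('red violets')
                    - length(replace('red violets', ' ', '')) + 1;
-- n = 1: pos1 = 1, pos2 = 4  -> substr(..., 1, 3) gives 'red'
-- n = 2: pos1 = 4, pos2 = 0  -> substr(..., 4) gives ' violets', trimmed to 'violets'
```

In the full query, prior id = id keeps this row generation within each row of t, and the prior dbms_random.value is not null condition is, as far as I know, the usual trick to keep Oracle from reporting a loop in the CONNECT BY.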