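Split a string into multiple rows

Time: 2013-11-19 18:08:57

Tags: sql oracle oracle11g

I have a table with a string column that holds several delimited values, for example: a;b;c

I need to split this string and use its values in a query. For example, I have the following table: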

str
a;b;c
b;c;d
a;c;d

I need to group by the individual values in the str column to get the following result:

str count(*)
a   2
b   2
c   3
d   2

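Can this be done with a single SELECT query? I cannot create a temporary table, extract the values into it, and then query that table.

3 Answers:

Answer 0 (score: 4)

From a comment on @PrzemyslawKruglej's answer: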

  

"The main problem is the inner query with connect by, which generates an astonishing number of rows."
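To make the scale of the problem concrete, here is a minimal sketch that counts the rows produced per level. It assumes the inner query in question is an unconstrained CONNECT BY split (no PRIOR condition); the inline `src` data set is a hypothetical stand-in for the question's table.

```sql
-- Assumption: an unconstrained CONNECT BY split, with no PRIOR condition.
-- src is a hypothetical inline copy of the sample data from the question.
with src as (
  select 'a;b;c' as str from dual union all
  select 'b;c;d' from dual union all
  select 'a;c;d' from dual
)
select level as lvl
     , count(*) as generated_rows
  from src
connect by regexp_substr(str, '[^;]+', 1, level) is not null
 group by level
 order by level;
```

With no PRIOR condition, every row at one level becomes a parent of every qualifying row at the next, so on the three sample rows this should report 3, 9 and 27 rows for levels 1 to 3. The growth is exponential in the number of tokens, which is why it becomes impractical on larger tables.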

The number of generated rows can be reduced with the following approach:

/* test table populated with the sample data from your question */
create table t1(str) as (
  select 'a;b;c'  from dual union all
  select 'b;c;d'  from dual union all
  select 'a;c;d'  from dual
);
-- Table created

--  The number of rows generated depends solely on the longest
--  string.
--  If, say, the longest string contains 3 words (not counting the `;` separators)
--  and we have 100 rows in our table, then we end up with 300 rows
--  for further processing, no more.
with occurrence(ocr) as( 
  select level 
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
    connect by level <= mx_t 
)
select count(regexp_substr(t1.str, '[^;]+', 1, o.ocr)) as generated_for_3_rows
  from t1
 cross join occurrence o;

Result: for three rows, the longest of which contains three words, we generate 9 rows:

GENERATED_FOR_3_ROWS
--------------------
                  9

Final query:

with occurrence(ocr) as( 
  select level 
    from ( select max(regexp_count(str, '[^;]+')) as mx_t
             from t1 ) t
    connect by level <= mx_t 
)
select res
     , count(res) as cnt
  from (select regexp_substr(t1.str, '[^;]+', 1, o.ocr) as res
          from t1
         cross join occurrence o)
 where res is not null
 group by res
 order by res;

Result:

RES          CNT
----- ----------
a              2
b              2
c              3
d              2

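Read more about the regexp_count() (11g and later) and regexp_substr() regular expression functions.

Note: regular expression functions are relatively expensive to compute, and when processing large amounts of data it may be worth considering a switch to plain PL/SQL.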
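Before moving to PL/SQL, the regular expressions can also be swapped for plain INSTR/SUBSTR calls inside the same query shape. The following is only a sketch of that substitution, not the PL/SQL approach mentioned in the note; the inline `src` data set is a hypothetical copy of the sample data, and whether it is actually faster would need to be measured on real data.

```sql
-- Sketch: same row-generation idea as the final query, without regex functions.
-- src is a hypothetical inline copy of the sample data from the question.
with src as (
  select 'a;b;c' as str from dual union all
  select 'b;c;d' from dual union all
  select 'a;c;d' from dual
),
occurrence as (
  -- one row per token position, up to the token count of the longest string
  select level as ocr
    from (select max(length(str) - length(replace(str, ';')) + 1) as mx_t
            from src)
 connect by level <= mx_t
)
select res
     , count(*) as cnt
  from (-- pad with ';' on both sides, then cut between separator n and n + 1;
        -- a non-positive length makes SUBSTR return NULL, filtered out below
        select substr(';' || s.str || ';',
                      instr(';' || s.str || ';', ';', 1, o.ocr) + 1,
                      instr(';' || s.str || ';', ';', 1, o.ocr + 1)
                        - instr(';' || s.str || ';', ';', 1, o.ocr) - 1) as res
          from src s
         cross join occurrence o)
 where res is not null
 group by res
 order by res;

-- RES   CNT
-- a       2
-- b       2
-- c       3
-- d       2
```

Padding the string with a separator on both sides means every token sits between two `;` characters, so one expression handles the first, middle and last tokens alike.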

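Answer 1 (score: 1)

This is ugly, but it seems to work. The problem with splitting via CONNECT BY is that it returns duplicate rows. I managed to get rid of them, but you will have to test it: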

WITH
  data AS (
    SELECT 'a;b;c' AS val FROM dual
    UNION ALL SELECT 'b;c;d' AS val FROM dual
    UNION ALL SELECT 'a;c;d' AS val FROM dual
  )
SELECT token, COUNT(1)
  FROM (
    SELECT DISTINCT token, lvl, val, p_val
      FROM (
        SELECT
            regexp_substr(val, '[^;]+', 1, level) AS token,
            level AS lvl,
            val,
            NVL(prior val, val) p_val
          FROM data
        CONNECT BY regexp_substr(val, '[^;]+', 1, level) IS NOT NULL
      )
    WHERE val = p_val
  )
GROUP BY token;

TOKEN                  COUNT(1)
-------------------- ----------
d                             2
b                             2
a                             2
c                             3
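A commonly used alternative avoids producing the duplicates in the first place by tying each generated row to its own source row with a PRIOR condition. This is a sketch of that idiom rather than part of the answer above. It assumes the values in `val` are unique; on a real table a primary key or rowid would be the safer thing to compare.

```sql
-- Sketch of the PRIOR-based idiom; assumes val is unique per row.
with data as (
  select 'a;b;c' as val from dual union all
  select 'b;c;d' from dual union all
  select 'a;c;d' from dual
)
select token
     , count(*) as cnt
  from (select regexp_substr(val, '[^;]+', 1, level) as token
          from data
       connect by level <= regexp_count(val, '[^;]+')
              and prior val = val                  -- stay on the same source row
              and prior sys_guid() is not null)    -- avoids the connect-by loop error
 group by token
 order by token;
```

Because each row only ever connects to itself, the hierarchy yields one row per token, and the DISTINCT and the `val = p_val` filter are no longer needed.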

Answer 2 (score: 0)

SELECT name, COUNT(name)
  FROM (
    SELECT REGEXP_SUBSTR('a;b;c', '[^;]+', 1, LEVEL) AS name FROM dual
    CONNECT BY REGEXP_SUBSTR('a;b;c', '[^;]+', 1, LEVEL) IS NOT NULL
    UNION ALL
    SELECT REGEXP_SUBSTR('b;c;d', '[^;]+', 1, LEVEL) AS name FROM dual
    CONNECT BY REGEXP_SUBSTR('b;c;d', '[^;]+', 1, LEVEL) IS NOT NULL
    UNION ALL
    SELECT REGEXP_SUBSTR('a;c;d', '[^;]+', 1, LEVEL) AS name FROM dual
    CONNECT BY REGEXP_SUBSTR('a;c;d', '[^;]+', 1, LEVEL) IS NOT NULL
  )
 GROUP BY name;

NAME  COUNT(NAME)
----- -----------
d               2
a               2
b               2
c               3