Get word counts from a column in SQL

Time: 2014-03-26 15:15:39

Tags: sql oracle word-count

After running the following queries:

SELECT * FROM table;

SELECT REGEXP_REPLACE(description || '!', '[^[:punct:]]') 
    FROM table;

SELECT REGEXP_REPLACE ( description, '[' ||  REGEXP_REPLACE ( description || '!', '[^[:punct:]]')  || ']') test 
    FROM table;

SELECT REGEXP_REPLACE(UPPER(TEST), ' ', '#') test 
    FROM (SELECT REGEXP_REPLACE (description, '[' ||  REGEXP_REPLACE (description || '!', '[^[:punct:]]')  || ']') test 
    FROM table);

I end up with a column in Oracle SQL that looks like this:

TEST
 ---------------------------------------------
 SPOKE#WITH#MR#SMITHS#ASSISTANT
 EMAILED#FOR#VISIT
 SCHEDULING#OFFICE#LM#FOR#VISIT
 LM#FOR#VISIT
 LM#FOR#VISIT
 PHONE#CALL
 ---------------------------------------------

All words are separated by #. I want to get the number of times each word occurs, for example:

word | count
------------
LM   |  3
FOR  |  4
VISIT|  4
PHONE|  1

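and so on. I am new to Oracle SQL and am only familiar with basic MySQL commands. Any help or pointers to tutorials would also be appreciated. Thanks.

Edit: there are about 1500 rows, with about 250 unique responses among them that I am trying to interpret.

1 answer:

Answer 0 (score: 2)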

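WITH mydata AS
  ( SELECT 1 AS id, 'SPOKE#WITH#MR#SMITHS#ASSISTANT' AS str FROM dual
    UNION ALL
    SELECT 2, 'EMAILED#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 3, 'SCHEDULING#OFFICE#LM#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 4, 'LM#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 5, 'LM#FOR#VISIT' FROM dual
    UNION ALL
    SELECT 6, 'PHONE#CALL' FROM dual
  ),
  splitted_words AS
  (
    -- One output row per word: level n picks the n-th '#'-separated token.
    SELECT REGEXP_SUBSTR(str,'[^#]+', 1, level) AS word
    FROM mydata
      -- Number of words = number of '#' characters + 1.
      CONNECT BY level   <= LENGTH(regexp_replace(str,'[^#]')) + 1
    -- Stay on the same source row. The key must be unique: 'LM#FOR#VISIT'
    -- appears twice, so matching on str alone would multiply those rows.
    AND PRIOR id          = id
    -- Non-deterministic call that stops Oracle from reporting a CONNECT BY loop.
    AND PRIOR sys_guid() IS NOT NULL
  )
SELECT word,
      COUNT(1) AS word_count
FROM splitted_words
GROUP BY word;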
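
To see how the splitting step works in isolation, the following sketch applies the same REGEXP_SUBSTR / CONNECT BY pattern to a single hard-coded string. The literal 'LM#FOR#VISIT' is taken from the sample data; the column alias n is only illustrative.

```sql
-- Split one literal string into its '#'-separated words.
-- LENGTH(REGEXP_REPLACE(s, '[^#]')) counts the '#' characters (2 here),
-- so the hierarchy generates levels 1..3, and level n selects the n-th word.
SELECT level AS n,
       REGEXP_SUBSTR('LM#FOR#VISIT', '[^#]+', 1, level) AS word
FROM dual
CONNECT BY level <= LENGTH(REGEXP_REPLACE('LM#FOR#VISIT', '[^#]')) + 1;
```

This should return three rows: LM, FOR and VISIT. With a multi-row source, an extra condition is needed to keep each hierarchy on its own source row, which is the job of the PRIOR conditions in the full query.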

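If your table is YOUR_TABLE and the column is YOUR_COLUMN:

  WITH splitted_words AS
  (
    SELECT REGEXP_SUBSTR(YOUR_COLUMN,'[^#]+', 1, level) AS word
    FROM YOUR_TABLE
      CONNECT BY level   <= LENGTH(regexp_replace(YOUR_COLUMN,'[^#]')) + 1
    -- Match on ROWID rather than on the column value: the column contains
    -- repeated strings, and matching on the value would multiply those rows.
    AND PRIOR ROWID       = ROWID
    AND PRIOR sys_guid() IS NOT NULL
  )
SELECT word,
      COUNT(1) AS word_count
FROM splitted_words
GROUP BY word;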
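
Since the question mentions roughly 1500 rows with only about 250 distinct responses, a variant that tolerates repeated strings and lists the most frequent words first may be useful. The sketch below is an assumption-laden variation, not part of the original answer: it assumes Oracle 11g or later (for REGEXP_COUNT), assumes YOUR_TABLE is an ordinary table so that ROWID is available, and introduces the alias word_count for illustration.

```sql
-- Sketch only: YOUR_TABLE and YOUR_COLUMN are the placeholders used above.
-- Assumes Oracle 11g+ (REGEXP_COUNT) and a regular table (ROWID available).
WITH splitted_words AS
  (
    SELECT REGEXP_SUBSTR(YOUR_COLUMN, '[^#]+', 1, level) AS word
    FROM YOUR_TABLE
      -- REGEXP_COUNT returns 0 when there is no '#', so one-word rows yield one level.
      CONNECT BY level <= REGEXP_COUNT(YOUR_COLUMN, '#') + 1
    -- ROWID is unique per row, so duplicate strings do not cross-join with each other.
    AND PRIOR ROWID = ROWID
    -- Non-deterministic call that stops Oracle from reporting a CONNECT BY loop.
    AND PRIOR sys_guid() IS NOT NULL
  )
SELECT word,
       COUNT(*) AS word_count
FROM splitted_words
WHERE word IS NOT NULL   -- drop empty tokens caused by leading, trailing or doubled '#'
GROUP BY word
ORDER BY word_count DESC, word;
```

Matching on ROWID instead of the column value matters when the same string occurs in several rows: with a value-based match, every duplicate row becomes a child of every other duplicate at each level, which inflates the counts for the later words in those strings.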