Oracle SQL: split text into columns based on each occurrence of a certain character set

Date: 2018-01-12 08:57:44

Tags: sql oracle

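Our database (Oracle) has a field named CONVERSATION that contains speech-to-text transcripts (stored as CLOB). After some preprocessing and replacing unnecessary characters, the field currently has the format shown in the example below.

I want to split the agent's text and the customer's text into separate columns, with each part that starts with "a:" or "c:" separated by commas. How can I do this?

"a:" stands for the agent and "c:" stands for the customer.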

CREATE TABLE TEXT_RECORDS (
    CONVERSATION CLOB
    );

INSERT INTO TEXT_RECORDS
(CONVERSATION)
VALUES
('a:some text 1 c:some text 2 a:some text 3 c:some text 4 a:some text 5 c:some text 6'); 

- EDITED (previously 'a:some_text_1 c:some_text_2 a:some_text_3 c:some_text_4 a:some_text_5 c:some_text_6')

The desired output is two separate fields:

CONV_AGENT                              CONV_CUSTOMER
some text 1 ,some text 3, some text 5   some text 2 ,some text 4, some text 6

2 answers:

Answer 0 (score: 2)

You can remove the substrings that do not have the correct prefix:

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE TEXT_RECORDS (
    CONVERSATION CLOB
    );

INSERT INTO TEXT_RECORDS(CONVERSATION)
SELECT 'a:some_text_1 c:some_text_2 a:some_text_3 c:some_text_4 a:some_text_5 c:some_text_6' FROM DUAL UNION ALL
SELECT 'a:some_text_1 a:some_text_2 a:some_text_3' FROM DUAL UNION ALL
SELECT 'c:some_text_1 a:some_text_2 a:some_text_3 c:some_text_4' FROM DUAL;

Query 1:

SELECT REGEXP_REPLACE(
         REGEXP_REPLACE(
           REGEXP_REPLACE(
             conversation,
             '.*?(a:(\S+))?(\s|$)',  -- Find each word starting with "a:"
             '\2, '                  -- replace with just that part without prefix
           ),
           '(, ){2,}', -- Replace multiple delimiters
           ', '        -- With a single delimiter
         ),
         '^, |, $'     -- Remove leading and trailing delimiters
       ) AS conv_agent,
       REGEXP_REPLACE(
         REGEXP_REPLACE(
           REGEXP_REPLACE(
             conversation,
             '.*?(c:(\S+))?(\s|$)',  -- Find each word starting with "c:"
             '\2, '                  -- replace with just that part without prefix
           ),
           '(, ){2,}', -- Replace multiple delimiters
           ', '        -- With a single delimiter
         ),
         '^, |, $'     -- Remove leading and trailing delimiters
       ) AS conv_customer
FROM   text_records

Results:

|                            CONV_AGENT |                         CONV_CUSTOMER |
|---------------------------------------|---------------------------------------|
| some_text_1, some_text_3, some_text_5 | some_text_2, some_text_4, some_text_6 |
| some_text_1, some_text_2, some_text_3 |                                       |
|              some_text_2, some_text_3 |              some_text_1, some_text_4 |
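
To see what each of the three nested REGEXP_REPLACE calls contributes, the stages of the agent column can be selected side by side. This is only an illustrative sketch against the same TEXT_RECORDS table; the CTE name `steps` and the column aliases `step1` to `step3` are made up for the example and are not part of the answer.

```sql
-- Illustrative breakdown of Query 1 (agent side only).
-- "steps", "step1", "step2" and "step3" are hypothetical names for this sketch.
WITH steps AS (
  SELECT REGEXP_REPLACE(
           conversation,
           '.*?(a:(\S+))?(\s|$)',  -- each word; group 2 is set only for "a:" words
           '\2, '                  -- keep group 2 (may be empty) plus a delimiter
         ) AS step1
  FROM   text_records
)
SELECT step1,                                             -- words without "a:" leave empty entries
       REGEXP_REPLACE(step1, '(, ){2,}', ', ') AS step2,  -- runs of delimiters collapsed to one
       REGEXP_REPLACE(
         REGEXP_REPLACE(step1, '(, ){2,}', ', '),
         '^, |, $'                                        -- leading/trailing delimiter removed
       ) AS step3                                         -- same expression as conv_agent
FROM   steps;
```

Swapping `a:` for `c:` in the first pattern gives the same breakdown for the customer column.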

Updated - spaces in the conversation sentences

SQL Fiddle

Oracle 11g R2 Schema Setup:

CREATE TABLE TEXT_RECORDS (
    CONVERSATION CLOB
    );

INSERT INTO TEXT_RECORDS(CONVERSATION)
SELECT 'a:some text 1 c:some text 2 a:some text 3 c:some text 4 a:some text 5 c:some text 6' FROM DUAL UNION ALL
SELECT 'a:some text 1 a:some text 2 a:some text 3' FROM DUAL UNION ALL
SELECT 'c:some text 1 a:some text 2 a:some text 3 c:some text 4' FROM DUAL;

Query 1:

SELECT REGEXP_REPLACE(
         REGEXP_REPLACE(
           REGEXP_REPLACE(
             conversation,
             '.*?(a:([^:]*))?(\s|$)',
             '\2, '
           ),
           '(, ){2,}',
           ', '
         ),
         '^, |, $'
       ) AS conv_agent,
       REGEXP_REPLACE(
         REGEXP_REPLACE(
           REGEXP_REPLACE(
             conversation,
             '.*?(c:([^:]*))?(\s|$)',
             '\2, '
           ),
           '(, ){2,}',
           ', '
         ),
         '^, |, $'
       ) AS conv_customer
FROM   text_records

Results:

|                            CONV_AGENT |                         CONV_CUSTOMER |
|---------------------------------------|---------------------------------------|
| some text 1, some text 3, some text 5 | some text 2, some text 4, some text 6 |
| some text 1, some text 2, some text 3 |                                       |
|              some text 2, some text 3 |              some text 1, some text 4 |

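Answer 1 (score: 0)

You can create two functions, one to get the agent's conversation and another for the customer's conversation. The function below gets the agent's conversation.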

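CREATE OR REPLACE FUNCTION get_agent_conv(p_text CLOB) RETURN CLOB
IS
    v_indx       NUMBER;        -- position of the current "a:" marker
    v_next       NUMBER;        -- position of the next " a:" or " c:" marker (0 if none)
    v_occur      NUMBER := 0;   -- which occurrence of "a:" is being processed
    v_agent_conv CLOB;
BEGIN
    LOOP
        v_occur := v_occur + 1;
        v_indx  := DBMS_LOB.INSTR(p_text, 'a:', 1, v_occur);
        -- Exit before appending, so the final failed lookup adds nothing
        EXIT WHEN NVL(v_indx, 0) = 0;
        -- The segment ends where the next speaker marker begins
        v_next := REGEXP_INSTR(p_text, ' [ac]:', v_indx + 2);
        IF v_next > 0 THEN
            v_agent_conv := v_agent_conv || ', ' || SUBSTR(p_text, v_indx + 2, v_next - v_indx - 2);
        ELSE
            -- Last segment: take everything up to the end of the text
            v_agent_conv := v_agent_conv || ', ' || SUBSTR(p_text, v_indx + 2);
        END IF;
    END LOOP;
    -- TRIM accepts only a single trim character, so strip the leading ', ' with LTRIM
    RETURN LTRIM(v_agent_conv, ', ');
END;
/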


SELECT GET_AGENT_CONV(conversation) agent_conversation
  FROM text_records;


AGENT_CONVERSATION                                                             
-------------------------------------
some_text_1, some_text_3, some_text_5
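
The answer does not show the customer-side function. A minimal sketch of one possible counterpart follows; the name `get_customer_conv` and its local variable names are assumptions for illustration. It treats each customer segment as running from a "c:" marker to the next " a:" or " c:" marker (or to the end of the text), and it assumes "c:" never appears inside ordinary words.

```sql
-- Hypothetical counterpart to get_agent_conv; the function name is an assumption.
CREATE OR REPLACE FUNCTION get_customer_conv(p_text CLOB) RETURN CLOB
IS
    v_indx      NUMBER;        -- position of the current "c:" marker
    v_next      NUMBER;        -- position of the next " a:" or " c:" marker (0 if none)
    v_occur     NUMBER := 0;   -- which occurrence of "c:" is being processed
    v_cust_conv CLOB;
BEGIN
    LOOP
        v_occur := v_occur + 1;
        v_indx  := DBMS_LOB.INSTR(p_text, 'c:', 1, v_occur);
        -- Stop as soon as no further "c:" marker exists
        EXIT WHEN NVL(v_indx, 0) = 0;
        -- Find where the next speaker marker starts, searching after this "c:"
        v_next := REGEXP_INSTR(p_text, ' [ac]:', v_indx + 2);
        IF v_next > 0 THEN
            v_cust_conv := v_cust_conv || ', ' || SUBSTR(p_text, v_indx + 2, v_next - v_indx - 2);
        ELSE
            -- Last segment: take everything up to the end of the text
            v_cust_conv := v_cust_conv || ', ' || SUBSTR(p_text, v_indx + 2);
        END IF;
    END LOOP;
    -- Remove the leading ', ' added in front of the first segment
    RETURN LTRIM(v_cust_conv, ', ');
END;
/

-- Both columns side by side, using the agent function from the answer
SELECT get_agent_conv(conversation)    AS conv_agent,
       get_customer_conv(conversation) AS conv_customer
  FROM text_records;
```

Unlike the regular-expression answer, this approach runs PL/SQL once per row, which may matter on large tables.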