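In our Oracle database there is a column named CONVERSATION that holds speech-to-text transcripts, stored as a CLOB. After some preprocessing and removal of unwanted characters, the column currently looks like the example below.
I want to split the agent's text and the customer's text into separate columns, with the parts that begin with "a:" or "c:" separated by commas. How can I do this?
"a:" marks the agent and "c:" marks the customer.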
CREATE TABLE TEXT_RECORDS (
CONVERSATION CLOB
);
INSERT INTO TEXT_RECORDS
(CONVERSATION)
VALUES
('a:some text 1 c:some text 2 a:some text 3 c:some text 4 a:some text 5 c:some text 6');
- EDITED (previously 'a:some_text_1 c:some_text_2 a:some_text_3 c:some_text_4 a:some_text_5 c:some_text_6')
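The desired output is two separate columns:
| CONV_AGENT | CONV_CUSTOMER |
|---------------------------------------|---------------------------------------|
| some text 1, some text 3, some text 5 | some text 2, some text 4, some text 6 |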
Answer 0 (score: 2)
You can strip out the substrings that do not have the prefix you want:
Oracle 11g R2 Schema Setup:
CREATE TABLE TEXT_RECORDS (
CONVERSATION CLOB
);
INSERT INTO TEXT_RECORDS(CONVERSATION)
SELECT 'a:some_text_1 c:some_text_2 a:some_text_3 c:some_text_4 a:some_text_5 c:some_text_6' FROM DUAL UNION ALL
SELECT 'a:some_text_1 a:some_text_2 a:some_text_3' FROM DUAL UNION ALL
SELECT 'c:some_text_1 a:some_text_2 a:some_text_3 c:some_text_4' FROM DUAL;
Query 1:
SELECT REGEXP_REPLACE(
REGEXP_REPLACE(
REGEXP_REPLACE(
conversation,
'.*?(a:(\S+))?(\s|$)', -- Find each word starting with "a:"
'\2, ' -- replace with just that part without prefix
),
'(, ){2,}', -- Replace multiple delimiters
', ' -- With a single delimiter
),
'^, |, $' -- Remove leading and trailing delimiters
) AS conv_agent,
REGEXP_REPLACE(
REGEXP_REPLACE(
REGEXP_REPLACE(
conversation,
'.*?(c:(\S+))?(\s|$)', -- Find each word starting with "c:"
'\2, ' -- replace with just that part without prefix
),
'(, ){2,}', -- Replace multiple delimiters
', ' -- With a single delimiter
),
'^, |, $' -- Remove leading and trailing delimiters
) AS conv_customer
FROM text_records
Results:
| CONV_AGENT | CONV_CUSTOMER |
|---------------------------------------|---------------------------------------|
| some_text_1, some_text_3, some_text_5 | some_text_2, some_text_4, some_text_6 |
| some_text_1, some_text_2, some_text_3 | |
| some_text_2, some_text_3 | some_text_1, some_text_4 |
Update: spaces within the conversation text
Oracle 11g R2 Schema Setup:
CREATE TABLE TEXT_RECORDS (
CONVERSATION CLOB
);
INSERT INTO TEXT_RECORDS(CONVERSATION)
SELECT 'a:some text 1 c:some text 2 a:some text 3 c:some text 4 a:some text 5 c:some text 6' FROM DUAL UNION ALL
SELECT 'a:some text 1 a:some text 2 a:some text 3' FROM DUAL UNION ALL
SELECT 'c:some text 1 a:some text 2 a:some text 3 c:some text 4' FROM DUAL;
Query 1:
SELECT REGEXP_REPLACE(
REGEXP_REPLACE(
REGEXP_REPLACE(
conversation,
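'.*?(a:([^:]*))?(\s|$)', -- Capture the text after "a:", stopping before the next colon
'\2, ' -- Keep only the captured text, followed by a delimiter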
),
'(, ){2,}',
', '
),
'^, |, $'
) AS conv_agent,
REGEXP_REPLACE(
REGEXP_REPLACE(
REGEXP_REPLACE(
conversation,
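'.*?(c:([^:]*))?(\s|$)', -- Capture the text after "c:", stopping before the next colon
'\2, ' -- Keep only the captured text, followed by a delimiter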
),
'(, ){2,}',
', '
),
'^, |, $'
) AS conv_customer
FROM text_records
Results:
| CONV_AGENT | CONV_CUSTOMER |
|---------------------------------------|---------------------------------------|
| some text 1, some text 3, some text 5 | some text 2, some text 4, some text 6 |
| some text 1, some text 2, some text 3 | |
| some text 2, some text 3 | some text 1, some text 4 |
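One thing to watch with the `[^:]*` version: it relies on colons appearing only in the "a:" and "c:" prefixes. If a transcript contains another colon (a time such as 10:30, say), the capture appears to stop early and the rest of that turn is dropped. A minimal sketch of a sanity check, assuming `REGEXP_COUNT` is available (it was added in Oracle 11g) and that every prefix sits at the start of the text or right after whitespace:

```sql
-- Rows returned here contain at least one colon that is not part of
-- a leading "a:" / "c:" speaker prefix and may be split incorrectly.
SELECT conversation
FROM   text_records
WHERE  REGEXP_COUNT(conversation, ':')
       <> REGEXP_COUNT(conversation, '(^|\s)[ac]:');
```

If this returns rows, those stray colons would need to be replaced during preprocessing before running the query above.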
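Answer 1 (score: 0)
You can create two functions, one that extracts the agent's side of the conversation and one that extracts the customer's side. The function below extracts the agent's side.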
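CREATE OR REPLACE FUNCTION get_agent_conv(p_text CLOB) RETURN CLOB
IS
  v_start      NUMBER;       -- position of the current "a:" marker
  v_end        NUMBER;       -- position of the next "c:" marker, if any
  v_agent_conv CLOB;
  v_occur      NUMBER := 0;
BEGIN
  LOOP
    v_occur := v_occur + 1;
    v_start := DBMS_LOB.INSTR(p_text, 'a:', 1, v_occur);
    EXIT WHEN NVL(v_start, 0) = 0;  -- stop before appending once no "a:" is left
    v_end := DBMS_LOB.INSTR(p_text, 'c:', v_start + 2);
    IF NVL(v_end, 0) = 0 THEN
      v_end := DBMS_LOB.GETLENGTH(p_text) + 1;  -- last turn runs to the end of the text
    END IF;
    -- Assumes agent and customer turns alternate, so each "a:" turn ends at the next "c:"
    v_agent_conv := v_agent_conv || ', '
                    || TRIM(SUBSTR(p_text, v_start + 2, v_end - v_start - 2));
  END LOOP;
  RETURN LTRIM(v_agent_conv, ', ');  -- TRIM takes only a single character, so use LTRIM
END;
/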
SELECT GET_AGENT_CONV(conversation) agent_conversation
FROM text_records;
AGENT_CONVERSATION
-------------------------------------
some_text_1, some_text_3, some_text_5
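The customer-side function is not shown above. A minimal sketch of what it might look like, mirroring the agent function with the two prefixes swapped. The name `get_customer_conv` is my own, and the sketch assumes that turns alternate and that "a:" and "c:" occur only as speaker prefixes:

```sql
-- Hypothetical counterpart to get_agent_conv; the function name is an assumption.
CREATE OR REPLACE FUNCTION get_customer_conv(p_text CLOB) RETURN CLOB
IS
  v_start         NUMBER;       -- position of the current "c:" marker
  v_end           NUMBER;       -- position of the next "a:" marker, if any
  v_customer_conv CLOB;
  v_occur         NUMBER := 0;
BEGIN
  LOOP
    v_occur := v_occur + 1;
    v_start := DBMS_LOB.INSTR(p_text, 'c:', 1, v_occur);
    EXIT WHEN NVL(v_start, 0) = 0;  -- no further "c:" marker
    v_end := DBMS_LOB.INSTR(p_text, 'a:', v_start + 2);
    IF NVL(v_end, 0) = 0 THEN
      v_end := DBMS_LOB.GETLENGTH(p_text) + 1;  -- last turn runs to the end of the text
    END IF;
    v_customer_conv := v_customer_conv || ', '
                       || TRIM(SUBSTR(p_text, v_start + 2, v_end - v_start - 2));
  END LOOP;
  RETURN LTRIM(v_customer_conv, ', ');  -- drop the leading delimiter
END;
/

-- Both sides as separate columns
SELECT get_agent_conv(conversation)    AS conv_agent,
       get_customer_conv(conversation) AS conv_customer
FROM   text_records;
```

Because each turn is cut at the next marker of the other speaker, two consecutive turns by the same speaker would run together, so the regular-expression approach is probably the safer choice for data like that.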