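I know how to read data from a file with utl_file, but I'm having trouble with data like the sample below. We can't assume which positions will be empty. How can we handle this?
Sample data: apple|bat|cat|"dog | dog"||||eee||abc
Expected output:
COL1: apple
COL2: bat
COL3: cat
COL4: dog dog
COL5:
COL6:
COL7:
COL8: eee
COL9:
COL10: abc
I tried the code below, but it doesn't handle the empty values: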
declare
  list    varchar2(3500) := 'apple|bat|cat|"dog | dog"||||eee||abc';
  pattern varchar2(20)   := '(" [^"]*"|[^|]+)';
  i       number := 0;
  j       number;
  f       varchar2(3500);
  c       sys_refcursor;
begin
  dbms_output.put_line('Raw list: ' || list);
  open c for
    select level as col,
           trim(regexp_substr(replace(list,'|','|'), pattern, 1, rownum)) split
      from dual
    connect by level <= length(regexp_replace(list, pattern)) + 1;
  loop
    fetch c into j, f;
    exit when c%notfound;
    dbms_output.put_line('Column ' || i || ': ' || replace(f, '"'));
    i := i + 1;
  end loop;
  close c;
end;
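This is the output I get, but I need the expected output shown above.
Raw list: apple|bat|cat|"dog | dog"||||eee||abc
Column 0: apple
Column 1: bat
Column 2: cat
Column 3: dog
Column 4: dog
Column 5: eee
Column 6: abc
Column 7:
Column 8:
Column 9:
Column 10: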
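Answer 0 (score: 0)
I wouldn't use regular expressions for this.
Instead, I'd use ordinary string-manipulation functions to break the string apart. For example: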
declare
  list             varchar2(3500) := 'apple|bat|cat|"dog | dog"||||eee||abc';
  next_pipe_pos    integer;
  close_quote_pos  integer;
  column_start_pos integer;
  column_num       integer;
  column_text      varchar2(4000);
begin
  column_start_pos := 1;
  column_num := 1;
  -- Appending a | character allows us to assume that all columns have a
  -- pipe following them and avoids any special-case handling for the last
  -- column.
  list := list || '|';
  while column_start_pos <= length(list)
  loop
    if substr(list, column_start_pos, 1) = '"' then
      -- Quoted column: it ends at the next quote that is followed by a pipe,
      -- so pipes inside the quotes are kept as data.
      close_quote_pos := instr(list, '"|', column_start_pos + 1);
      if close_quote_pos = 0 then
        -- Mismatched quotes.
        raise no_data_found;
      end if;
      column_text := substr(list, column_start_pos + 1, close_quote_pos - column_start_pos - 1);
      column_start_pos := close_quote_pos + 2;
    else
      -- Unquoted column: it ends at the next pipe. An empty column yields
      -- a zero-length substr, which is null.
      next_pipe_pos := instr(list, '|', column_start_pos);
      exit when next_pipe_pos = 0;
      column_text := substr(list, column_start_pos, next_pipe_pos - column_start_pos);
      column_start_pos := next_pipe_pos + 1;
    end if;
    dbms_output.put_line('Column ' || column_num || ': ' || column_text);
    column_num := column_num + 1;
  end loop;
end;
It's more code, but it's arguably less cryptic than the regular expression you're using.
Running this produces the following output:
Column 1: apple
Column 2: bat
Column 3: cat
Column 4: dog | dog
Column 5:
Column 6:
Column 7:
Column 8: eee
Column 9:
Column 10: abc
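If you need the fields for further processing rather than just printing them, the same loop could be wrapped in a helper. The following is only a rough sketch under a few assumptions: split_line is a hypothetical local function name (nothing built in), it returns the fields as a sys.odcivarchar2list collection, and it reports an unterminated quote with an arbitrary error number of -20001.

```plsql
declare
  -- Variables must be declared before any local subprogram bodies.
  l_cols sys.odcivarchar2list;

  -- Hypothetical helper: splits one pipe-delimited line into its fields,
  -- keeping empty fields as nulls and treating "..." as a single field.
  function split_line(p_line in varchar2) return sys.odcivarchar2list is
    -- Trailing pipe so every field, including the last, ends with a pipe.
    l_line   varchar2(32767) := p_line || '|';
    l_fields sys.odcivarchar2list := sys.odcivarchar2list();
    l_start  pls_integer := 1;
    l_end    pls_integer;
  begin
    while l_start <= length(l_line) loop
      if substr(l_line, l_start, 1) = '"' then
        -- Quoted field: ends at the next quote followed by a pipe.
        l_end := instr(l_line, '"|', l_start + 1);
        if l_end = 0 then
          raise_application_error(-20001, 'Unterminated quoted field');
        end if;
        l_fields.extend;
        l_fields(l_fields.count) := substr(l_line, l_start + 1, l_end - l_start - 1);
        l_start := l_end + 2;
      else
        -- Unquoted field: ends at the next pipe (always found, because of
        -- the appended trailing pipe).
        l_end := instr(l_line, '|', l_start);
        l_fields.extend;
        l_fields(l_fields.count) := substr(l_line, l_start, l_end - l_start);
        l_start := l_end + 1;
      end if;
    end loop;
    return l_fields;
  end split_line;
begin
  l_cols := split_line('apple|bat|cat|"dog | dog"||||eee||abc');
  for i in 1 .. l_cols.count loop
    dbms_output.put_line('COL' || i || ': ' || l_cols(i));
  end loop;
end;
/
```

Since you are already reading the file with utl_file, each line returned by utl_file.get_line could be passed to split_line in the same way. Note that sys.odcivarchar2list elements are limited to 4000 characters, so a longer field would need a different collection type.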