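I want to split a column in Postgres that holds a CSV row. The fields in this line of text are separated by pipes; sometimes they are enclosed in quotes and sometimes not. In addition, characters can be escaped with a backslash.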
field1|"field2"|field3|"22 \" lcd \| screen "
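Is there a regular expression that can split this column (i.e. using regexp_split_to_array(....))?
Answer 0: (score: 1)
This does not use a regular expression, but it works: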
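-- Split delimited text into one text[] per input line using Python's csv module.
-- Requires the untrusted PL/Python language (plpythonu); recent PostgreSQL releases ship only plpython3u.
create or replace function split_csv(
line text,
delim_char char(1) = ',',
quote_char char(1) = '"')
returns setof text[] immutable language plpythonu as $$
import csv
# Each input line becomes one row; a backslash escapes the delimiter or quote character.
return csv.reader(line.splitlines(), quotechar=quote_char, delimiter=delim_char, skipinitialspace=True, escapechar='\\')
$$;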
select *, x[4] from split_csv('field1|"field2"|field3|"22 \" lcd \| screen "'||E'\n'||'a|b', delim_char := '|') as x;
╔══════════════════════════════════════════════╤════════════════════╗
║                      x                       │         x          ║
╠══════════════════════════════════════════════╪════════════════════╣
║ {field1,field2,field3,"22 \" lcd | screen "} │ 22 " lcd | screen  ║
║ {a,b}                                        │ ░░░░               ║
╚══════════════════════════════════════════════╧════════════════════╝
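To see what the function body does without a database, the same csv.reader call can be tried in plain Python. This is only a minimal sketch: the helper name split_csv_line is made up for illustration, and it assumes Python 3 rather than the interpreter behind plpythonu.

```python
import csv


def split_csv_line(line, delim_char=",", quote_char='"'):
    """Hypothetical stand-alone version of the split_csv body above.

    Returns one list of fields per input line.
    """
    reader = csv.reader(
        line.splitlines(),
        delimiter=delim_char,
        quotechar=quote_char,
        escapechar="\\",        # backslash escapes the quote or delimiter
        skipinitialspace=True,  # ignore spaces right after a delimiter
    )
    return list(reader)


# Same input as the SQL query above: two lines, pipe-delimited.
sample = 'field1|"field2"|field3|"22 \\" lcd \\| screen "\na|b'
rows = split_csv_line(sample, delim_char="|")

for row in rows:
    print(row)
```

The fourth field of the first row comes back with the backslashes removed, matching the x[4] column in the query output. The second row has only two fields, which is why x[4] is NULL there.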