PostgreSQL - regex to split a CSV line with potential quotes

Time: 2017-02-02 18:15:38

Tags: postgresql csv postgresql-9.4

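I want to split a column in Postgres that holds a CSV line. The fields in this line are delimited by pipes; sometimes they are enclosed in quotes and sometimes not. In addition, characters can be escaped.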

field1|"field2"|field3|"22 \" lcd \| screen "

Is there a regular expression that splits this column (i.e. using regexp_split_to_array(....))?

1 answer:

Answer 0: (score: 1)

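This does not use a regular expression, but it works: a PL/Python function that hands the parsing to Python's csv module.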

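-- Requires the untrusted PL/Python language: create extension plpythonu;
create or replace function split_csv(
  line text,
  delim_char char(1) = ',',
  quote_char char(1) = '"')
returns setof text[] immutable language plpythonu as $$
  # Each input line becomes one text[] row; csv handles quoting and backslash escapes
  import csv
  return csv.reader(line.splitlines(), quotechar=quote_char, delimiter=delim_char, skipinitialspace=True, escapechar='\\')
$$;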

select *, x[4] from split_csv('field1|"field2"|field3|"22 \" lcd \| screen "'||E'\n'||'a|b', delim_char := '|') as x;
╔══════════════════════════════════════════════╤════════════════════╗
║                      x                       │         x          ║
╠══════════════════════════════════════════════╪════════════════════╣
║ {field1,field2,field3,"22 \" lcd | screen "} │ 22 " lcd | screen  ║
║ {a,b}                                        │ ░░░░               ║
╚══════════════════════════════════════════════╧════════════════════╝
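
To check the parsing outside the database, the same csv.reader call can be run in plain Python. This is only a minimal sketch, assuming Python 3; the name split_csv_rows is hypothetical and simply mirrors the function body above.

```python
import csv


def split_csv_rows(text, delim_char=",", quote_char='"'):
    """Parse text into one list of fields per input line (hypothetical helper)."""
    reader = csv.reader(
        text.splitlines(),
        delimiter=delim_char,
        quotechar=quote_char,
        skipinitialspace=True,
        escapechar="\\",  # a backslash makes the next character literal
    )
    return [row for row in reader]


if __name__ == "__main__":
    # Raw string so the backslashes reach the parser unchanged
    sample = r'field1|"field2"|field3|"22 \" lcd \| screen "' + "\n" + "a|b"
    for row in split_csv_rows(sample, delim_char="|"):
        print(row)
```

The escape character is consumed during parsing, so the fourth field comes back with the quote and the pipe unescaped, which matches the x[4] column in the query output above.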