Oracle's REGEXP_SUBSTR function has worked fine for me for the last few months. But suddenly I have content like the following:
"field 1 "",""","field 2","field 3", "","","","field 7"
In this case, the expected matches are (https://regex101.com/r/s2v60b/1):
field 1: "field 1 "","""
field 2: "field 2"
field 3: "field 3"
field 4: ""
field 5: ""
field 6: ""
field 7: "field 7"
Even VS Code understands what I mean, since its syntax coloring splits the fields correctly.
But when I evaluate the following query in Oracle:
SELECT
REGEXP_SUBSTR(
'"field 1 "",""","field 2","field 3", "","","","field 7"'
, '(^|,)("((?:""|[^"])*)")', 1, 1, '', 2) TEXT
FROM DUAL;
field 1 is truncated to "field 1 ", which shifts the remaining fields.
Do you know what I am doing wrong, and perhaps how to fix it?
Answer 0 (score: 1)
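Oracle does not support non-capturing groups, so you cannot use (?:).
Just remove the ?: to make it a capturing group and your code should work. The numbering of the outer groups does not change, so subexpression 2 still refers to the quoted field. You may also need to add \s*, since there is white space between a comma and the opening quote.
For example: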
(^|,)\s*("((""|[^"])*)")
Output for your test data:
FIELD1          | FIELD2    | FIELD3    | FIELD4 | FIELD5 | FIELD6 | FIELD7
:-------------- | :-------- | :-------- | :----- | :----- | :----- | :--------
"field 1 "",""" | "field 2" | "field 3" | ""     | ""     | ""     | "field 7"
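The pattern above returns each field with its enclosing quotes and with embedded quotes still doubled. If the bare value is wanted, one possible follow-up (a sketch, not part of the original answer) is to request subexpression 3, which is the text between the outer quotes once the ?: is removed, and then collapse the doubled quotes with REPLACE. The inline view and the alias field1_unquoted are made up for this illustration.

```sql
-- Sketch: return the first field without its enclosing quotes.
-- Group numbering in (^|,)\s*("((""|[^"])*)"):
--   1 = (^|,)
--   2 = the whole quoted field, quotes included
--   3 = the text between the outer quotes
--   4 = the last ""-or-single-character repetition
SELECT REPLACE(
         REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 1, NULL, 3 ),
         '""', '"'          -- turn each escaped "" back into a single "
       ) AS field1_unquoted -- hypothetical alias
FROM   ( SELECT '"field 1 "",""","field 2","field 3", "","","","field 7"' AS csv
         FROM   DUAL );
```

For the first field this should give field 1 ",". An empty field ("") comes back as NULL, because Oracle treats a zero-length string as NULL.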
Written out as a full query, with one call per field (the fourth argument selects the occurrence):
SELECT REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 1, NULL, 2 ) AS field1,
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 2, NULL, 2 ) AS field2,
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 3, NULL, 2 ) AS field3,
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 4, NULL, 2 ) AS field4,
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 5, NULL, 2 ) AS field5,
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 6, NULL, 2 ) AS field6,
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, 7, NULL, 2 ) AS field7
FROM   table_name
If you want to match both quoted and unquoted values, you can use:
SELECT REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 1, NULL, 1 ) AS field1,
REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 2, NULL, 1 ) AS field2,
REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 3, NULL, 1 ) AS field3,
REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 4, NULL, 1 ) AS field4,
REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 5, NULL, 1 ) AS field5,
REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 6, NULL, 1 ) AS field6,
REGEXP_SUBSTR( csv, '([^",]*|"([^"]|"")*")(,|$)', 1, 7, NULL, 1 ) AS field7
FROM table_name
Output:
FIELD1          | FIELD2    | FIELD3    | FIELD4          | FIELD5   | FIELD6              | FIELD7
:-------------- | :-------- | :-------- | :-------------- | :------- | :------------------ | :--------------
"field 1 "",""" | "field 2" | "field 3" | ""              | ""       | ""                  | "field 7"
"field 1.1"     | 2.1       | "3.1"     | "field ""4"".1" | field5.1 | "field ""6"".""1""" | """field 7.1"""
db<>fiddle here
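The queries in this answer repeat the same call once per field. If the number of fields is not fixed, a common alternative is to generate one row per field with CONNECT BY LEVEL and REGEXP_COUNT. The following is only a sketch under some assumptions: it uses the quoted-only pattern, it reads the question's string from an inline view, and the names sample_csv, field_no and field_value are invented for the example.

```sql
-- Sketch: one row per field instead of one column per field.
-- sample_csv stands in for table_name and holds a single row.
WITH sample_csv AS (
  SELECT '"field 1 "",""","field 2","field 3", "","","","field 7"' AS csv
  FROM   DUAL
)
SELECT LEVEL AS field_no,
       -- LEVEL is passed as the occurrence argument, so row n holds the n-th match
       REGEXP_SUBSTR( csv, '(^|,)\s*("((""|[^"])*)")', 1, LEVEL, NULL, 2 ) AS field_value
FROM   sample_csv
-- stop once every match of the pattern has been returned
CONNECT BY LEVEL <= REGEXP_COUNT( csv, '(^|,)\s*("((""|[^"])*)")' );
```

For the question's string this should produce seven rows, one for each column in the output shown earlier. With more than one source row, the CONNECT BY clause would need extra conditions to keep the rows from being combined with each other, so treat this as a starting point rather than a drop-in replacement.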