Question

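I have many rows with city and direction fields. But in rows from an old import, the city and the direction are mixed together in the direction field. For example: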

dir number,  extra data, CITY,  AL 111111
dir number, CITY,  AL 111111
number, dir, number, CITY, dir number, CITY,  AL 111111

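The important thing is that 'CITY' always comes right before the state and US ZIP code. I would like to extract it and save it in the city field with an UPDATE (using a regular expression?). Is that possible?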

Something like:

update TABLE set city = SOME_REGEX_MAGIC_FROM_DIRECTION_FIELD
where direction ~ 'REGEX_MAGIC'

The SQL statement that ended up working:

update TABLE
set city = substring(direction FROM '([^,]+),[^,]+$')
where direction like '%,  __ _____';

Answer 1

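If you want the part before the last comma (which must itself be preceded by a comma), one way (of many) is a plain substring() call (the regexp variant):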

substring(direction FROM ',([^,]+),[^,]+$') AS city


Your UPDATE statement could look like this:

UPDATE tbl
SET    city = substring(direction FROM ',([^,]+),[^,]+$')
WHERE  direction ~ ', *\D\D \d{5}$'
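To preview what the expression extracts before running the UPDATE, a SELECT over a few literal rows can help. This is only a sketch: the rows below are hypothetical values modeled on the question's examples (with five-digit ZIP codes), and the btrim() call is an addition that strips the spaces left around the captured city.

```sql
-- Hypothetical sample rows; no table is needed to try this.
SELECT direction,
       btrim(substring(direction FROM ',([^,]+),[^,]+$')) AS city
FROM  (VALUES
         ('dir number,  extra data, CITY,  AL 11111'),
         ('dir number, CITY,  AL 11111'),
         ('CITY,  AL 11111')  -- no comma before the city: the pattern does not match, city is NULL
      ) AS t(direction);
```

Because the pattern requires a comma in front of the city, a value that starts directly with the city yields NULL rather than a partial match.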

Answer 2

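Based on your data, I gather that you need something like this:

-- direction is the column (not a string literal); the pattern includes the comma
-- and state so that the capture group is the city rather than the state
SELECT regexp_matches(direction, '([^,]+), *\D\D \d{5}') FROM tbl;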
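Note that regexp_matches() returns a set of text arrays, not a plain string. If a single text value is wanted, one option is regexp_match() (available in PostgreSQL 10 and later) with an array subscript. The sketch below assumes hypothetical sample rows with five-digit ZIP codes and a pattern that includes the comma and state before the ZIP; btrim() is added to drop the leading space.

```sql
-- Hypothetical sample rows; regexp_match() returns text[], so [1] picks the first capture group.
SELECT direction,
       btrim((regexp_match(direction, '([^,]+), *\D\D \d{5}'))[1]) AS city
FROM  (VALUES
         ('dir number, CITY,  AL 11111'),
         ('CITY,  AL 11111')
      ) AS t(direction);
```

Unlike the substring() variant, this pattern does not need a comma in front of the city, so it also handles values that begin with the city.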

Answer 3

To get a substring with a regular expression in Redshift:

SELECT REGEXP_SUBSTR(
   'hello_uuid_092bab12-8d8b-40ad-b8b7-bc9f05e52c9c_something_else',
   '([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})'
)

Result: 092bab12-8d8b-40ad-b8b7-bc9f05e52c9c

Extract a string from a text field in PostgreSQL (using a regular expression?)
