Question

I have the following table (called data):

row    comments
  1    Fortune favors https://something.aaa.org/show_screen.cgi?id=548545 the 23 bold
  2    No man 87485 is id# 548522 an island 65654.       
  3    125 Better id NEWLINE #546654 late than 5875565 never.
  4    555 Better id546654 late than 565 never

I used the following query:

select row, substring(substring(comments::text, '((id|ID) [0-9]+)'), '[0-9]+') as id 
from data 
where comments::text ~* 'id [0-9]+';

The output of this query ignores rows 1 through 3. Only row 4 is handled:

row   id
 4    546654

Does anyone know how to extract the ID number correctly? Note that an ID has at most 9 digits.

Answer 1

Use regexp_replace():

SELECT c.rownr
        , regexp_replace (c.comments, e'.*[Ii][Dd][^0-9]*([0-9]+).*', '\1' ) AS the_id
        , c.comments AS comments
FROM comments c
        ;

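.* matches the initial garbage
[Ii][Dd] matches the string "id"; case does not matter
[^0-9]* consumes any non-digit characters
([0-9]+) matches the digit string you want
.* matches any trailing characters
'\1' (the third argument) says that you want only what matched inside the first ()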
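
One thing the breakdown above does not cover: regexp_replace() returns its input unchanged when the pattern does not match, so a comment without an ID would come back whole in the the_id column. The sketch below is one possible way to guard against that, not part of the original answer. It inlines the question's sample rows in a CTE named c, adds a made-up row 5 that has no ID, and reuses the core of the pattern as a WHERE filter. It assumes standard_conforming_strings is on (the PostgreSQL default), so '\1' needs no extra escaping.

```sql
-- Rows 1-4 are copied from the question; row 5 is a hypothetical non-matching row.
WITH c(rownr, comments) AS (
    VALUES
        (1, 'Fortune favors https://something.aaa.org/show_screen.cgi?id=548545 the 23 bold'),
        (2, 'No man 87485 is id# 548522 an island 65654.'),
        (3, '125 Better id NEWLINE #546654 late than 5875565 never.'),
        (4, '555 Better id546654 late than 565 never'),
        (5, 'nothing to extract')
)
SELECT c.rownr
     , regexp_replace(c.comments, '.*[Ii][Dd][^0-9]*([0-9]+).*', '\1') AS the_id
FROM c
-- Keep only rows where "id", optional non-digits, then digits actually occur;
-- without this filter row 5 would be returned with its full comment as the_id.
WHERE c.comments ~ '[Ii][Dd][^0-9]*[0-9]+'
ORDER BY c.rownr;
```

Rows 1 to 4 should yield the same IDs as the answer's query, and row 5 is dropped by the filter.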

Result:

 rownr | the_id |                         comments                                    
-------+--------+--------------------------------------------------------------------------------
     1 | 548545 | Fortune favors https://something.aaa.org/show_screen.cgi?id=548545 the 23 bold
     2 | 548522 | No man 87485 is id# 548522 an island 65654.       
     3 | 546654 | 125 Better id NEWLINE #546654 late than 5875565 never.
     4 | 546654 | 555 Better id546654 late than 565 never
(4 rows)
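
The question also says that an ID has at most 9 digits, which the pattern above does not enforce. As a hedged variant (not from the original answer), substring() with a POSIX pattern returns the text matched by the first parenthesized group and yields NULL when nothing matches, and a {1,9} bound caps the captured digits. The CTE name, the column name row_nr, and the non-matching row 5 are assumptions made for illustration.

```sql
-- Rows 1-4 are copied from the question; row 5 is a hypothetical non-matching row.
WITH data(row_nr, comments) AS (
    VALUES
        (1, 'Fortune favors https://something.aaa.org/show_screen.cgi?id=548545 the 23 bold'),
        (2, 'No man 87485 is id# 548522 an island 65654.'),
        (3, '125 Better id NEWLINE #546654 late than 5875565 never.'),
        (4, '555 Better id546654 late than 565 never'),
        (5, 'nothing to extract')
)
SELECT row_nr
       -- "id" in any case, optional non-digits, then capture 1 to 9 digits.
     , substring(comments from '[Ii][Dd][^0-9]*([0-9]{1,9})') AS the_id
FROM data
ORDER BY row_nr;
```

Note that the bound truncates rather than rejects: a longer digit run after "id" would be cut to its first 9 digits. If such rows should be excluded instead, the pattern would need an additional check on the character that follows the digits.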

Split a specific chain of digits out of a string
