How to group by a regular expression in a Postgres query

Time: 2017-03-10 21:44:19

Tags: sql postgresql postgresql-8.4

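I'm cleaning up someone else's RESTful application, and in the process it looks like some routes are not being used. To start troubleshooting, I created a table with a single unique text column to store the routes.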

---------routes---------
https://test.com/user/1/info
https://test.com/test/2/info
https://test.com/banana/100
https://test.com/post/3/date
https://test.com/post/
https://test.com/grape/
http://test.com/post/3/date
https://test.com/banana/3
https://test.com/user/2/info
https://test.com/test/5/info
.
.
.

What I'd like to do now is use some regex (or other) query to group the entries above and get the following result:

---------routes---------
https://test.com/user/{x}/info
https://test.com/test/{x}/info
https://test.com/post/{x}/date
https://test.com/post/
https://test.com/grape/
http://test.com/post/{x}/date
https://test.com/banana/{x}

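where {x} is just a generic token produced by the grouping. I know we can search for a specific regex, but I don't know how to collapse the strings into groups and then spit out the "recommended" groupings.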

PS: Because we are stuck in the stone age, any solution is limited to PostgreSQL 8.4.20.

Edit -

Klin, your answer didn't quite work for me, because it gives me

        regexp_replace        | count 
------------------------------+-------
 https://test.com/user/1/info |     1
 https://test.com/test/2/info |     1
 https://test.com/banana/100  |     1
 \x01{x}ate                   |     2
 https://test.com/user/2/info |     1
 https://test.com/grape/      |     1
 https://test.com/test/5/info |     1
 https://test.com/post/       |     1
 https://test.com/banana/3    |     1
(9 rows)

But at least this gives me some ideas. I'll reply once I've played with it some more.

1 answer:

Answer 0 (score: 3)

I can't test this on 8.4 ...

with routes(url) as (
values
    ('https://test.com/user/1/info'),
    ('https://test.com/test/2/info'),
    ('https://test.com/banana/100'),
    ('https://test.com/post/3/date'),
    ('https://test.com/post/'),
    ('https://test.com/grape/'),
    ('http://test.com/post/3/date'),
    ('https://test.com/banana/3'),
    ('https://test.com/user/2/info'),
    ('https://test.com/test/5/info')
)

select regexp_replace(url, '^(.+//.+/.+/)\d+', '\1{x}'), count(*)
from routes
group by 1

         regexp_replace         | count 
--------------------------------+-------
 https://test.com/banana/{x}    |     2
 https://test.com/post/{x}/date |     1
 http://test.com/post/{x}/date  |     1
 https://test.com/user/{x}/info |     2
 https://test.com/test/{x}/info |     2
 https://test.com/grape/        |     1
 https://test.com/post/         |     1
(7 rows)    

You can test this here (Postgres 9.5).

Check pattern here.
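
A likely explanation for the \x01{x}ate row in the question's edit: on 8.4 the standard_conforming_strings setting defaults to off, so backslashes inside ordinary '...' literals are consumed as escapes before the regex engine sees them. Under that assumption '\d' arrives as a plain d and '\1' as the byte 0x01, which matches the output shown. The sketch below is the same query with E'...' literals and doubled backslashes, which should behave the same way on 8.4 and 9.5. It has not been run on 8.4.20; the inline routes CTE stands in for the real table, and the route alias and order by are additions for readability.

```sql
-- Same grouping as above, written so the backslashes survive string parsing
-- whether standard_conforming_strings is on or off.
with routes(url) as (
values
    ('https://test.com/user/1/info'),
    ('https://test.com/test/2/info'),
    ('https://test.com/banana/100'),
    ('https://test.com/post/3/date'),
    ('https://test.com/post/'),
    ('https://test.com/grape/'),
    ('http://test.com/post/3/date'),
    ('https://test.com/banana/3'),
    ('https://test.com/user/2/info'),
    ('https://test.com/test/5/info')
)
-- E'\\d' reaches the regex engine as \d (a digit);
-- E'\\1' reaches it as \1 (the first captured group).
select regexp_replace(url, E'^(.+//.+/.+/)\\d+', E'\\1{x}') as route,
       count(*)
from routes
group by 1
order by 1;
```

To run it against the real table, drop the with clause and select from that table instead. Another option, also an assumption about the setup, is to issue set standard_conforming_strings = on; for the session and keep the original literals unchanged.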