Question

I am going to use a for/each loop to search for different names (table2) in the text information of records in another table (table1), using a regular expression.

SELECT id FROM "table1"
where tags ~* 'south\s?\*?africa'
   or description ~* 'south\s?\*?south'
order by id asc;

But I don't know how to put this into a for/each loop!

table1:

 t1ID | NAME
 1    | Shiraz      
 2    | south africa
 3    | Limmatplatz

table2:

t2ID |TAGS                   | DESCRIPTIONS
101  |shiraz;Zurich;river    | It is too hot in Shiraz and Limmatplatz
201  |southafrica;limmatplatz| we went for swimming

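I have a list of names in table1. The other table holds text information that may contain those names. I want to get back the IDs of the table2 rows together with the IDs of the table1 items they contain.

For example: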

t2id | t1id
101  |1
101  |3
201  |2
201  |3

My tables have 60,000 and 550,000 rows. I need a time-efficient way to do this!

Answer 1

You don't need a loop. A plain join does the job.

SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
JOIN   table2 t2 ON t2.tags        ~* replace(t1.name, ' ', '\s?\*?')
                 OR t2.description ~* replace(t1.name, ' ', '\s?\*?')
ORDER  BY t2.id;
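
To try the join on the sample rows from the question, here is a minimal sketch. It assumes the column names `id`, `name`, `tags` and `description` used in the query above (the question's sample tables label them `t1ID`, `t2ID` and `DESCRIPTIONS`), and it assumes `standard_conforming_strings` is on (the default in current Postgres versions), so the backslashes in the pattern are taken literally. The temporary tables are only for illustration.

```sql
-- Hypothetical sample data mirroring the question.
-- Temp tables shadow any permanent tables of the same name for this session.
CREATE TEMP TABLE table1 (id int PRIMARY KEY, name text NOT NULL);
CREATE TEMP TABLE table2 (id int PRIMARY KEY, tags text, description text);

INSERT INTO table1 (id, name) VALUES
  (1, 'Shiraz'),
  (2, 'south africa'),
  (3, 'Limmatplatz');

INSERT INTO table2 (id, tags, description) VALUES
  (101, 'shiraz;Zurich;river',     'It is too hot in Shiraz and Limmatplatz'),
  (201, 'southafrica;limmatplatz', 'we went for swimming');

-- Each name becomes a case-insensitive regex; a space in the name
-- is relaxed to "optional whitespace, optional literal asterisk".
SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
JOIN   table2 t2 ON t2.tags        ~* replace(t1.name, ' ', '\s?\*?')
                 OR t2.description ~* replace(t1.name, ' ', '\s?\*?')
ORDER  BY t2.id, t1.id;
```

On this data the query should return the four pairs listed as the desired result in the question. Note that each name is used as a regular expression, so names containing regex metacharacters (such as `.` or `(`) would need escaping first.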

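But performance will be poor for big tables. There are several things you can do to improve it:

Normalize table2.tags into a separate 1:n table. Or, if tags are reused (the typical case), use an n:m relationship to a tag table. Details:
- How to implement a many-to-many relationship in PostgreSQL?
Use trigram or text search indexes.
- PostgreSQL LIKE query performance variations
Use a LATERAL join to actually make use of those indexes.
- LATERAL JOIN not using trigram index
Ideally, use the new feature in Postgres 9.6 to search for phrases with full text search. The release notes:

Full-text search can now search for phrases (multiple adjacent words)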
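
The index-related measures above could be sketched roughly as follows. This is an illustration under assumptions, not a tuned solution: the index names are hypothetical, the `pg_trgm` extension must be installable on the server, and whether the planner actually picks the indexes depends on the data and the Postgres version, so check with `EXPLAIN ANALYZE`.

```sql
-- Trigram indexes can support regular expression matches (~ and ~*).
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index names.
CREATE INDEX table2_tags_trgm_idx        ON table2 USING gin (tags gin_trgm_ops);
CREATE INDEX table2_description_trgm_idx ON table2 USING gin (description gin_trgm_ops);

-- LATERAL variant: for every name, look up matching table2 rows.
-- Splitting the OR into two SELECTs gives each column a chance to use
-- its own index; UNION removes rows that match on both columns.
SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
CROSS  JOIN LATERAL (
   SELECT id FROM table2 WHERE tags        ~* replace(t1.name, ' ', '\s?\*?')
   UNION
   SELECT id FROM table2 WHERE description ~* replace(t1.name, ' ', '\s?\*?')
   ) t2
ORDER  BY t2.id, t1.id;

-- Phrase search (Postgres 9.6 or later) as an alternative for the
-- description column. The 'simple' configuration is an assumption;
-- it lower-cases words but does no stemming.
SELECT t2.id AS t2id, t1.id AS t1id
FROM   table1 t1
JOIN   table2 t2
  ON   to_tsvector('simple', t2.description) @@ phraseto_tsquery('simple', t1.name)
ORDER  BY t2.id, t1.id;
```

The phrase search is not equivalent to the regular expression: it matches whole adjacent words, so a name like 'south africa' would not match text written as one word ('southafrica'). An expression index on `to_tsvector('simple', description)` would be needed for it to be fast.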
