Question

My previous question has been resolved. Now I need to develop a related but more complex query.

I have a table like this:

id     description          additional_info
-------------------------------------------
123    games                XYD
124    Festivals sport      swim

I need to count matches against an array like this:

array_content varchar[] := '{"Festivals,games","sport,swim"}';

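If either of the columns description and additional_info contains any of the comma-separated tokens of an array element, we count that as 1. So each array element (which may consist of multiple words) can contribute at most 1 to the total.

The result for the above example should be: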

id    RID    Matches
1     123    1
2     124    2

Answer 1

The answer isn't simple, but figuring out what you are asking was harder:

SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   unnest(array_content) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id;

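This produces exactly your desired result. array_content is your array of search terms.

How does this work?

Each element of the outer array of search terms is itself a comma-separated list. Decompose this odd construct by unnesting twice (after converting each element of the outer array into another array with string_to_array()). For example: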
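To try the query without creating a table or a PL/pgSQL variable, the sample rows from the question can be supplied inline. This is only a sketch: the CTE named tbl stands in for the real table, the search array is written as a literal instead of the array_content variable, and the final ORDER BY is added just to make the output order deterministic.

```sql
-- Sample rows from the question, supplied inline; "tbl" is an assumed name.
WITH tbl(id, description, additional_info) AS (
   VALUES (123, 'games',           'XYD')
        , (124, 'Festivals sport', 'swim')
   )
SELECT row_number() OVER (ORDER BY t.id) AS id
     , t.id AS "RID"
     , count(DISTINCT a.ord) AS "Matches"
FROM   tbl t
LEFT   JOIN (
   -- array literal used in place of the array_content variable
   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id
ORDER  BY t.id;
```

With these two rows, 123 matches only the first array element (via games) and 124 matches both elements, which corresponds to the 1 and 2 requested in the question.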

SELECT *
FROM   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
CROSS  JOIN LATERAL
       unnest(string_to_array(elem, ',')) txt;

Result:

 elem            | ord |  txt
-----------------+-----+------------
 Festivals,games | 1   | Festivals
 Festivals,games | 1   | games
 sport,swim      | 2   | sport
 sport,swim      | 2   | swim

Since you want to count a match for each outer array element only once, we generate a unique number per element on the fly with WITH ORDINALITY. Details:

PostgreSQL unnest() with element number

Now we can LEFT JOIN to this derived table on the desired match condition:

   ... ON t.description ~ a.txt
       OR t.additional_info ~ a.txt

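... and get the count with count(DISTINCT a.ord), which counts each array element only once even if several of its search terms match.

Finally, I added the mysterious id from your result with row_number() OVER (ORDER BY t.id) AS id, assuming it is meant to be a serial number. Voilà.

The same caveats for regular expression matches (~) as in your previous question apply: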
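The difference between counting matched terms and counting matched array elements can be made visible by putting both aggregates side by side. The following is an illustrative sketch only: the inline CTE tbl and the extra row 125 (which matches nothing) are assumptions added for demonstration and are not part of the question's data.

```sql
-- Row 124 is from the question; row 125 is a made-up row without any match.
WITH tbl(id, description, additional_info) AS (
   VALUES (124, 'Festivals sport', 'swim')
        , (125, 'chess',           'none')
   )
SELECT t.id
     , count(a.ord)          AS term_hits     -- one per matching search term
     , count(DISTINCT a.ord) AS element_hits  -- one per matching array element
FROM   tbl t
LEFT   JOIN (
   unnest('{"Festivals,games","sport,swim"}'::varchar[]) WITH ORDINALITY x(elem, ord)
   CROSS JOIN LATERAL
   unnest(string_to_array(elem, ',')) txt
   ) a ON t.description ~ a.txt
       OR t.additional_info ~ a.txt
GROUP  BY t.id
ORDER  BY t.id;
```

For row 124 three terms match (Festivals, sport, swim), but sport and swim belong to the same array element, so term_hits is 3 while element_hits is 2. Row 125 is kept by the LEFT JOIN with a NULL ord, and both counts come out as 0 because count() ignores NULL values.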

Postgres query to calculate matching strings
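The caveats themselves are in the linked answer and are not reproduced here. As a rough illustration of the kind of behavior to watch for, the sketch below shows some general properties of the ~ operator in PostgreSQL; whether these are exactly the points made in the earlier answer is an assumption. The string literals assume the default setting standard_conforming_strings = on, so backslashes are passed to the regex engine unchanged.

```sql
SELECT 'Festivals sport' ~  'sport'     AS substring_match   -- true: matches anywhere in the string
     , 'Festivals sport' ~  'Sport'     AS case_sensitive    -- false: ~ is case-sensitive
     , 'Festivals sport' ~* 'Sport'     AS case_insensitive  -- true: ~* ignores case
     , 'swimming'        ~  'swim'      AS partial_word      -- true: part of a word also matches
     , 'swimming'        ~  '\yswim\y'  AS whole_word_only   -- false: \y requires word boundaries
     , 'abc'             ~  'a.c'       AS metacharacter     -- true: the dot matches any character
     , 'abc'             ~  'a\.c'      AS escaped_dot;      -- false: escaped dot matches only a literal dot
```

If search terms may contain characters such as . or +, or if only whole words should count, the join condition would need to be adapted accordingly, for example by escaping the terms or wrapping them in \y.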
