Question

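How can we create all combinations, of any length, of the values in one column, and return the distinct count of another column for each combination?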

Table:

+------+--------+
| Type |  Name  |
+------+--------+
| A    | Tom    |
| A    | Ben    |
| B    | Ben    |
| B    | Justin |
| C    | Ben    |
+------+--------+

Output table:

+-------------+-------+
| Combination | Count |
+-------------+-------+
| A           |     2 |
| B           |     2 |
| C           |     1 |
| AB          |     3 |
| BC          |     2 |
| AC          |     2 |
| ABC         |     3 |
+-------------+-------+

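When the combination is only A, there are Tom and Ben, so the count is 2.

When the combination is only B, there are 2 distinct names, so the count is 2.

When the combination is A and B, there are 3 distinct names (Tom, Ben, Justin), so the count is 3.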
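To make the counting rule concrete, here is a minimal Python sketch of it, using the sample rows from the table above. The function name `distinct_name_count` is hypothetical, and the sketch assumes each type is a single character, so a combination such as AB can be written as the string "AB".

```python
# Sample rows from the question's table: (Type, Name)
rows = [
    ("A", "Tom"),
    ("A", "Ben"),
    ("B", "Ben"),
    ("B", "Justin"),
    ("C", "Ben"),
]

def distinct_name_count(rows, combination):
    """Count distinct names whose type is one of the letters in `combination`."""
    wanted = set(combination)  # assumption: single-character types
    return len({name for type_, name in rows if type_ in wanted})

print(distinct_name_count(rows, "A"))   # Tom, Ben
print(distinct_name_count(rows, "AB"))  # Tom, Ben, Justin
```

In other words, the count for a combination is the size of the union of the name sets of its types, not their intersection.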

I am working in Amazon Redshift. Thanks!

Answer 1

Note: this answers the original version of the question, which was tagged Postgres.

You can generate all the combinations with this code:

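with recursive td as (
      -- distinct type values to combine
      select distinct type
      from t
     ),
     cte as (
      -- start with every single type
      select td.type, td.type as lasttype, 1 as len
      from td
      union all
      -- append only types greater than the last one, so each combination appears once
      select cte.type || td.type, td.type as lasttype, cte.len + 1
      from cte join
           td
           on td.type > cte.lasttype
     )
select type, len
from cte
order by len, type;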
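To see why the `lasttype` comparison yields each combination exactly once, the recursion can be traced outside the database. The following Python sketch mirrors the idea level by level; the function name `expand_combinations` is hypothetical, and it assumes the types compare as plain strings.

```python
def expand_combinations(types):
    """Mirror the recursive CTE: start from single types, then append only larger types."""
    distinct = sorted(set(types))
    # Each entry is (combination, lasttype); the first level has length 1.
    level = [(t, t) for t in distinct]
    result = []
    while level:
        result.extend(combo for combo, _ in level)
        # Recursive step: extend a combination only with types greater than its last type.
        level = [
            (combo + t, t)
            for combo, last in level
            for t in distinct
            if t > last
        ]
    return result

print(expand_combinations(["A", "A", "B", "B", "C"]))
# ['A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC']
```

Because a type is only ever appended after a smaller one, "BA" is never produced alongside "AB", and the recursion stops once no larger type remains.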

You can then use this in a join:

with recursive td as (
      select distinct type
      from t
     ),
     cte as (
      select td.type, td.type as lasttype, 1 as len
      from td
      union all
      select cte.type || td.type, td.type as lasttype, cte.len + 1
      from cte join
           td
           on td.type > cte.lasttype
     )
-- a name belongs to a combination if its type is one of the combination's letters
-- (assumes single-character types that contain no LIKE wildcards)
select cte.type as combination, count(distinct t.name) as cnt
from cte join
     t
     on cte.type like '%' || t.type || '%'
group by cte.type, cte.len
order by cte.len, cte.type;
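As a quick local check of this approach, the sketch below runs a recursive-CTE query of the same shape against the question's sample rows using Python's built-in `sqlite3` module. This is an assumption made for convenience: SQLite is neither Postgres nor Redshift, although it accepts this recursive CTE syntax. The combination column is named `combo` here for readability, the function name `combination_counts` is hypothetical, and the final select counts distinct names per combination, which is what the question's expected output calls for.

```python
import sqlite3

# Sample rows from the question: (type, name)
rows = [("A", "Tom"), ("A", "Ben"), ("B", "Ben"), ("B", "Justin"), ("C", "Ben")]

QUERY = """
with recursive td as (
      select distinct type from t
     ),
     cte as (
      select td.type as combo, td.type as lasttype, 1 as len
      from td
      union all
      select cte.combo || td.type, td.type, cte.len + 1
      from cte join td on td.type > cte.lasttype
     )
select cte.combo, count(distinct t.name)
from cte join t on cte.combo like '%' || t.type || '%'
group by cte.combo, cte.len
order by cte.len, cte.combo
"""

def combination_counts(rows):
    """Load the rows into an in-memory SQLite table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (type text, name text)")
        conn.executemany("insert into t values (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()

for combo, count in combination_counts(rows):
    print(combo, count)
```

On the sample data this reproduces the output table from the question, with rows ordered by combination length and then alphabetically.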

Answer 2

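It is not possible to generate all possible combinations (A, B, C, AB, AC, BC, etc.) in Amazon Redshift.

(Well, you could select each unique value, concatenate them into one string, pass it to a User-Defined Function, expand the result into multiple rows, and then join that against a big query, but that really isn't something you would want to attempt.)

One approach would be to create a table containing all possible combinations. You would need to write a small program to do that (e.g. using itertools in Python). You could then join the data against that table fairly easily to get the desired result (e.g. IF 'ABC' CONTAINS '%A%').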
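A minimal sketch of such a program, using `itertools.combinations` from the Python standard library, might look like the following. Note that it is a variation on the idea above: instead of a single column of combination strings matched with a contains test, it emits one (combination, member type) row per member, so that the join can be a plain equality on the type. The function name `combination_rows` is hypothetical.

```python
from itertools import combinations

def combination_rows(types):
    """Return (combination, member_type) rows for every non-empty subset of types."""
    distinct = sorted(set(types))
    rows = []
    for size in range(1, len(distinct) + 1):
        for combo in combinations(distinct, size):
            label = "".join(combo)  # e.g. ('A', 'B') -> 'AB'
            rows.extend((label, member) for member in combo)
    return rows

for row in combination_rows(["A", "B", "C"]):
    print(row)
```

The resulting rows could be loaded into a lookup table and joined to the data on the member type, grouping by the combination and counting distinct names. Keep in mind that n distinct types produce 2^n - 1 combinations, so this is only practical for a small number of types.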
