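How to count all combined occurrences in SQL?

Time: 2010-11-08 20:25:01

Tags: sql postgresql combinations

Is there a way to get the counts of all combinations of elements in a single SQL query, without using temporary tables or procedures?

Consider these three tables: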

  • products (id, product_name)

  • transactions (id, date)

  • transaction_has_products (id, product_id, transaction_id)

Sample data

  • products

    1   AAA
    2   BBB
    3   CCC
    
  • transactions

    1   some_date
    2   some_date
    
  • transaction_has_products

    1   1   1
    2   2   1
    3   3   1
    4   1   2
    5   2   2
    

The result should be:

AAA, BBB = 2   
AAA, CCC = 1   
BBB, CCC = 1   
AAA, BBB, CCC = 1

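4 answers:

Answer 0: (score: 1)

Not easily, because the last row matches a different number of products than the other rows. You might be able to do it with some kind of GROUP_CONCAT() operator (available in MySQL; implementable in other DBMSs such as Informix and probably PostgreSQL), but I'm not confident about that.

Pairwise matching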

SELECT p1.product_name AS name1, p2.product_name AS name2, COUNT(*)
  FROM (SELECT p.product_name, h.transaction_id
          FROM products AS p
          JOIN transaction_has_products AS h ON h.product_id = p.id
       ) AS p1
  JOIN (SELECT p.product_name, h.transaction_id
          FROM products AS p
          JOIN transaction_has_products AS h ON h.product_id = p.id
       ) AS p2
    ON p1.transaction_id = p2.transaction_id
   AND p1.product_name   < p2.product_name   -- keep each unordered pair once
 GROUP BY p1.product_name, p2.product_name;

Handling the triple match is non-trivial; extending it further is definitely quite hard.
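
For a fixed size of three, one possible extension is to chain one more self-join onto the pairwise query. The sketch below is only an illustration under a few assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, loads the sample data from the question, and takes the column names from the question's schema. The CTE name `named` is hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption),
# loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE transaction_has_products (
        id INTEGER PRIMARY KEY, product_id INTEGER, transaction_id INTEGER);
    INSERT INTO products VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
    INSERT INTO transaction_has_products VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 2, 2);
""")

# "named" is a hypothetical helper CTE: one row per (product name, transaction).
TRIPLE_SQL = """
WITH named AS (
    SELECT p.product_name AS name, h.transaction_id
      FROM products AS p
      JOIN transaction_has_products AS h ON h.product_id = p.id
)
SELECT a.name, b.name, c.name, COUNT(*)
  FROM named AS a
  JOIN named AS b ON b.transaction_id = a.transaction_id AND a.name < b.name
  JOIN named AS c ON c.transaction_id = b.transaction_id AND b.name < c.name
 GROUP BY a.name, b.name, c.name
"""

triples = conn.execute(TRIPLE_SQL).fetchall()
print(triples)  # [('AAA', 'BBB', 'CCC', 1)]
```

Every additional combination size needs one more join and a separate query, which is why this approach does not generalize to combinations of arbitrary size.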

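Answer 1: (score: 1)

If you know up front what all the products will be, you can do it by pivoting the data like this.

If you don't know the products ahead of time, you could build this query dynamically in a stored procedure. The practicality of either approach breaks down if the number of products is large, but I think that is likely to be true however this requirement is implemented.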

select
    product_combination, 
    case product_combination
        when 'AAA, BBB' then aaa_bbb
        when 'AAA, CCC' then aaa_ccc
        when 'BBB, CCC' then bbb_ccc
        when 'AAA, BBB, CCC' then aaa_bbb_ccc
    end as number_of_transactions
from
(
    select 'AAA, BBB' as product_combination union all
    select 'AAA, CCC' union all
    select 'BBB, CCC' union all
    select 'AAA, BBB, CCC'
) as combination_list
cross join
(
    select
        sum(case when aaa = 1 and bbb = 1 then 1 else 0 end) as aaa_bbb,
        sum(case when aaa = 1 and ccc = 1 then 1 else 0 end) as aaa_ccc,
        sum(case when bbb = 1 and ccc = 1 then 1 else 0 end) as bbb_ccc,
        sum(case when aaa = 1 and bbb = 1 and ccc = 1 then 1 else 0 end) as aaa_bbb_ccc
    from
    (
        select
            count(case when a.product_name = 'AAA' then 1 else null end) as aaa,
            count(case when a.product_name = 'BBB' then 1 else null end) as bbb,
            count(case when a.product_name = 'CCC' then 1 else null end) as ccc,
            b.transaction_id
        from
            products a
        inner join
            transaction_has_products b
        on
            a.id = b.product_id
        group by
            b.transaction_id
    ) as product_matrix
) as combination_counts

Result:

product_combination  number_of_transactions
AAA, BBB             2
AAA, CCC             1
BBB, CCC             1
AAA, BBB, CCC        1
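
The pivot query above hard-codes the product names. As a rough sketch of the "build it dynamically" idea, the following generates an equivalent query from a list of names in client code rather than in a stored procedure. It is a variation on the pivot, not the exact query above: it emits one SELECT per combination instead of a CASE over a cross join. Assumptions: Python's sqlite3 module stands in for PostgreSQL (so placeholders are `?`), at least two product names are supplied, and `build_combination_query` and the flag columns `f0`, `f1`, ... are hypothetical names.

```python
import sqlite3
from itertools import combinations


def build_combination_query(product_names):
    """Return (sql, params) counting transactions per combination of 2+ names."""
    # One 0/1 flag column per product and transaction, named f0, f1, ...
    flag_columns = ", ".join(
        f"MAX(CASE WHEN p.product_name = ? THEN 1 ELSE 0 END) AS f{i}"
        for i in range(len(product_names))
    )
    selects, labels = [], []
    for size in range(2, len(product_names) + 1):
        for combo in combinations(range(len(product_names)), size):
            condition = " AND ".join(f"f{i} = 1" for i in combo)
            selects.append(
                "SELECT ? AS product_combination, "
                "COUNT(*) AS number_of_transactions "
                f"FROM product_matrix WHERE {condition}"
            )
            labels.append(", ".join(product_names[i] for i in combo))
    sql = (
        "WITH product_matrix AS ("
        f"SELECT h.transaction_id, {flag_columns} "
        "FROM products AS p "
        "JOIN transaction_has_products AS h ON p.id = h.product_id "
        "GROUP BY h.transaction_id) "
        + " UNION ALL ".join(selects)
    )
    # Placeholders are bound in textual order: flag names first, then labels.
    return sql, list(product_names) + labels


# Sample data from the question, in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE transaction_has_products (
        id INTEGER PRIMARY KEY, product_id INTEGER, transaction_id INTEGER);
    INSERT INTO products VALUES (1, 'AAA'), (2, 'BBB'), (3, 'CCC');
    INSERT INTO transaction_has_products VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 2, 2);
""")

sql, params = build_combination_query(["AAA", "BBB", "CCC"])
counts = dict(conn.execute(sql, params).fetchall())
print(counts)
```

The generated statement has one SELECT per combination, so its size grows as 2^n with the number of products, which matches the caveat about large product lists.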

Answer 2: (score: 0)

Depends on how much control you have over the query (this is T-SQL and may need changing for PostgreSQL)

SELECT COUNT(*) FROM transactions t WHERE
(
     SELECT COUNT(DISTINCT tp.product_id) 
     FROM transaction_has_products tp 
     WHERE tp.[transaction_id] = t.id and tp.product_id IN (1, 2, 3)
) = 3

where (1, 2, 3) is the list of IDs you want to check for, and = 3 is the number of entries in that list.
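
To avoid editing both the ID list and the count by hand, the query can be parameterized. This is a minimal sketch, assuming Python's sqlite3 module as a stand-in for the real database, the sample data from the question, and a non-empty ID list; the function name `count_transactions_with` is hypothetical.

```python
import sqlite3

# Sample data from the question, in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (id INTEGER PRIMARY KEY, date TEXT);
    CREATE TABLE transaction_has_products (
        id INTEGER PRIMARY KEY, product_id INTEGER, transaction_id INTEGER);
    INSERT INTO transactions VALUES (1, 'some_date'), (2, 'some_date');
    INSERT INTO transaction_has_products VALUES
        (1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 2, 2);
""")


def count_transactions_with(conn, product_ids):
    """Count transactions that contain every product ID in product_ids."""
    ids = sorted(set(product_ids))  # duplicates would break the "= N" check
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT COUNT(*)
          FROM transactions AS t
         WHERE (SELECT COUNT(DISTINCT tp.product_id)
                  FROM transaction_has_products AS tp
                 WHERE tp.transaction_id = t.id
                   AND tp.product_id IN ({placeholders})) = ?
    """
    # The last parameter is the list length, i.e. the "= 3" in the query above.
    return conn.execute(sql, ids + [len(ids)]).fetchone()[0]


print(count_transactions_with(conn, [1, 2]))     # 2
print(count_transactions_with(conn, [1, 2, 3]))  # 1
```

This answers the question for one combination per call; covering all combinations still requires calling it once for each of them.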

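Answer 3: (score: 0)

  1. Generate all possible combinations. I relied on this: https://stackoverflow.com/a/9135162/2244766 (it is a bit tricky and I don't fully understand the logic... but it works!)
  2. Write a subquery that aggregates transaction_has_products into an array of products per transaction_id
  3. Use the array "contains" operator
  4. Join them

    After the steps above you can end up with something like this: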

    with all_combis as (
        with RECURSIVE y1 as (
                with x1 as (
                    --select id from products
                    select distinct product_id as a from transaction_has_products 
                )
                select array[a] as b ,a as c ,1 as d 
                from x1
                union all
                select b||a,a,d+1
                from x1
                join y1 on (a < c)
        )
        select *
        from y1
    )
    , grouped_transactions as (
      SELECT 
        array_agg(product_id) as products
      FROM transaction_has_products
      GROUP BY transaction_id
    )
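    -- count the joined column rather than *, so that combinations matching
    -- no transaction report 0 instead of 1 after the left join
    SELECT all_combis.b, count(grouped_transactions.products)
    from all_combis
    left JOIN grouped_transactions ON grouped_transactions.products @> all_combis.b 
    --WHERE array_upper(b, 1) > 1 -- or whatever
    GROUP BY all_combis.b
    order by array_upper(b, 1) desc, count(grouped_transactions.products) desc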
    

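    You can join against your products table to replace the product IDs with their names, but I guess you'll get it from here. Here's the fiddle (sqlfiddle is having a bad day today, so check on your own database in case it throws some odd error such as a timeout).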
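
For readers who want to check the logic of the four steps without a database, here they are as a small plain-Python sketch. It assumes the sample rows from the question; the variable names (`rows`, `baskets`, `all_combis`, `counts`) are illustrative, and Python sets stand in for the PostgreSQL arrays and the `@>` operator.

```python
from itertools import combinations

# transaction_has_products rows from the question: (id, product_id, transaction_id)
rows = [(1, 1, 1), (2, 2, 1), (3, 3, 1), (4, 1, 2), (5, 2, 2)]

# Step 1: every non-empty combination of the distinct product IDs.
product_ids = sorted({product_id for _, product_id, _ in rows})
all_combis = [
    combo
    for size in range(1, len(product_ids) + 1)
    for combo in combinations(product_ids, size)
]

# Step 2: aggregate the products of each transaction into one set.
baskets = {}
for _, product_id, transaction_id in rows:
    baskets.setdefault(transaction_id, set()).add(product_id)

# Steps 3 and 4: a combination matches a transaction when the transaction's
# set contains it; count the matching transactions per combination.
counts = {
    combo: sum(1 for basket in baskets.values() if basket.issuperset(combo))
    for combo in all_combis
}

# Largest combinations first, then by descending count, like the ORDER BY above.
for combo, n in sorted(counts.items(), key=lambda kv: (-len(kv[0]), -kv[1])):
    print(combo, n)
```

As in the query, single-product combinations are included; filter on `len(combo) > 1` to keep only the rows the question asks for.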

    GL,HF:D