Using window functions to compare one set against another

Time: 2019-02-15 11:05:56

Tags: postgresql window-functions

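The following query (tested on PostgreSQL 11.1) computes two values for each customer/product combination:

  • (A) the total sales value the customer spent on that product
  • (B) the total sales value the customer spent in that product's parent category

It then divides A by B to produce a metric called loyalty.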

select
  pp.customer, pp.product, pp.category,
  pp.sales_product / pc.sales_category as loyalty
from (
    select
      t.household_key as customer,
      t.product_id as product,
      p.commodity as category,
      sum(t.sales_value) as sales_product
    from transaction_data t
    left join product p on p.product_id = t.product_id
    group by t.household_key, t.product_id, p.commodity
) pp
left join (
    select
      t.household_key as customer,
      p.commodity as category,
      sum(t.sales_value) as sales_category
    from transaction_data t
    left join product p on p.product_id = t.product_id
    group by t.household_key, p.commodity
) pc on pp.customer = pc.customer and pp.category = pc.category
;

The result has the following form:

customer      product    category     loyalty
---------------------------------------------
       1       tomato        food        0.01
       1         beef        food        0.02
       1   toothpaste     hygiene        0.04
       1   toothbrush     hygiene        0.03

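My question: instead of relying on two subqueries and then left joining them, is it possible to use window functions to do this in a single query?

I tried the following, but it does not work. It fails with the error column "t.sales_value" must appear in the GROUP BY clause or be used in an aggregate function. I don't see how to solve this.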

-- does not work
select
  t.household_key as customer,
  t.product_id as product,
  p.commodity as category,
  sum(t.sales_value) as sales_product,
  sum(t.sales_value) over (partition by t.household_key, p.commodity) as sales_category
from transaction_data t
left join product p on p.product_id = t.product_id
group by t.household_key, t.product_id, p.commodity;
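A possible fix for the attempt above, offered as a sketch and not tested against the original data: PostgreSQL evaluates window functions after GROUP BY, so the argument of a window function may itself be an aggregate. Wrapping the grouped sum in a window sum then yields the category total at the same query level. The nullif guard against a zero category total is an addition that the original query does not have.

```sql
-- Sketch: aggregate per customer/product, then sum those group totals
-- across each customer/category partition with a window function.
select
  t.household_key as customer,
  t.product_id as product,
  p.commodity as category,
  sum(t.sales_value)
    / nullif(
        sum(sum(t.sales_value)) over (partition by t.household_key, p.commodity),
        0  -- assumption: return null instead of failing on a zero category total
      ) as loyalty
from transaction_data t
left join product p on p.product_id = t.product_id
group by t.household_key, t.product_id, p.commodity;
```

The inner sum(t.sales_value) is the ordinary grouped aggregate; the outer sum(...) over (...) runs over the already grouped rows, which is why the partition columns must also appear in the GROUP BY list.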

1 answer:

Answer 0: (score: 1)

I don't know how to do this without any join or subquery, but here is one way to do it with a subquery using analytic functions:

WITH cte AS (
    SELECT
        t.household_key AS customer,
        t.product_id AS product,
        p.commodity as category,
        SUM(t.sales_value) OVER (PARTITION BY t.household_key, t.product_id, p.commodity)
            AS sales_product,
        SUM(t.sales_value) OVER (PARTITION BY t.household_key, p.commodity)
            AS sales_category
    FROM transaction_data t
    LEFT JOIN product p
        ON p.product_id = t.product_id
)

SELECT
    t.customer,
    t.product,
    t.category,
    MAX(t.sales_product) / MAX(t.sales_category) AS loyalty
FROM cte t
GROUP BY
    t.customer,
    t.product,
    t.category;

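The trick here is to make a single pass over the joined tables and use analytic sums to compute the required aggregates over two different partitions, one with two columns and one with three. We can then aggregate by the three columns. Because every row in a group carries the same window sums, taking the MAX of each per group is an arbitrary but safe choice.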
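To see the two partitions at work, here is a self-contained sketch that runs without any existing tables. The sample rows are hypothetical and only mirror the column names from the question. It also swaps GROUP BY with MAX for SELECT DISTINCT, which should collapse the duplicate rows equally well since the window sums are identical within each group; ROUND is added only to keep the output short.

```sql
-- Hypothetical sample data standing in for the real tables.
WITH transaction_data (household_key, product_id, sales_value) AS (
    VALUES (1, 'tomato', 2.00),
           (1, 'tomato', 1.00),
           (1, 'beef', 7.00),
           (1, 'toothpaste', 4.00)
),
product (product_id, commodity) AS (
    VALUES ('tomato', 'food'),
           ('beef', 'food'),
           ('toothpaste', 'hygiene')
),
cte AS (
    SELECT
        t.household_key AS customer,
        t.product_id AS product,
        p.commodity AS category,
        -- three-column partition: total per customer/product
        SUM(t.sales_value) OVER (PARTITION BY t.household_key, t.product_id, p.commodity)
            AS sales_product,
        -- two-column partition: total per customer/category
        SUM(t.sales_value) OVER (PARTITION BY t.household_key, p.commodity)
            AS sales_category
    FROM transaction_data t
    LEFT JOIN product p
        ON p.product_id = t.product_id
)
-- DISTINCT removes the repeated rows (tomato appears twice in the CTE output).
SELECT DISTINCT
    customer,
    product,
    category,
    ROUND(sales_product / sales_category, 2) AS loyalty
FROM cte
ORDER BY customer, category, product;
```

With these sample rows the food total for customer 1 is 10.00, so beef comes out at 0.70 and tomato at 0.30, while toothpaste is the only hygiene purchase and comes out at 1.00.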