SQL: select the rows with the largest difference in a column

Time: 2018-11-02 23:29:31

Tags: sql postgresql subquery greatest-n-per-group window-functions

I have two Postgres tables, shown below, called client and order.

id | name
------------
41 | james
29 | melinda
36 | henry
...

id | date | volume | client_id
------------------------------
328 | 2018-01-03 | 16 | 41
411 | 2018-01-29 | 39 | 29
129 | 2018-01-13 | 73 | 29
542 | 2018-01-22 | 62 | 36
301 | 2018-01-17 | 38 | 41
784 | 2018-01-08 | 84 | 29
299 | 2018-01-10 | 54 | 36
300 | 2018-01-10 | 18 | 36
178 | 2018-01-30 | 37 | 36
...

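I then wrote a query with the following logic:

i) Find the difference in volume between each order and the previous order (grouped by client and by date). For the first order, this column is null.

ii) Show the client together with the maximum volume difference for each client.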

with cte AS
  (SELECT t.name,
          t.date,
          t.volume,
          t.volume - lag(t.volume) over (
                                         ORDER BY t.name, t.date) AS change
   FROM
     (SELECT c.name,
             o.date,
             sum(o.volume) volume
      FROM orders o
      JOIN client c using (client_id)
      GROUP BY c.name,
               o.date
      ORDER BY c.name,
               o.date) t)
SELECT cte.name,
       max(abs(change))
FROM cte
GROUP BY name

Below is the result table.

name | max
------------
james   | 22
melinda | 34
henry   | 25

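I am looking for advice on three things.

a) Can the difference be shown with its sign? For client IDs 29 and 36, the values should be -34 and -25 respectively.

b) Can the date be shown as well? I tried selecting the date column from the CTE, but it did not work.

c) Does anyone have general suggestions on how I could improve the query to make it more performant or more readable?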

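2 answers:

Answer 0: (score: 1)

Postgres supports DISTINCT ON, which is really convenient for your query. In addition, because window functions and aggregate functions can be combined at the same level, the query can be simplified: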

SELECT DISTINCT ON (co.name) co.*
FROM (SELECT c.name, o.date, SUM(o.volume) as volume,
             LAG(SUM(o.volume)) OVER (PARTITION BY c.name ORDER BY o.date) as prev_volume
      FROM orders o JOIN
           client c 
           USING (client_id)
      GROUP BY c.name, o.date
     ) co
ORDER BY co.name, ABS(volume - prev_volume) DESC
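
One thing to watch with a query of this shape: in Postgres, ORDER BY ... DESC sorts NULLs first by default, so each client's earliest date (where prev_volume is NULL) would be the row DISTINCT ON keeps. The following is only a sketch of one way around that, assuming the same table and column names as the query above; it also returns the date and the signed change asked for in points a) and b):

```sql
-- Sketch only: assumes orders(client_id, date, volume) and client(client_id, name),
-- i.e. the names used in the query above.
SELECT DISTINCT ON (co.name)
       co.name,
       co.date,
       co.volume - co.prev_volume AS change      -- signed difference
FROM (SELECT c.name, o.date, SUM(o.volume) AS volume,
             LAG(SUM(o.volume)) OVER (PARTITION BY c.name ORDER BY o.date) AS prev_volume
      FROM orders o
      JOIN client c USING (client_id)
      GROUP BY c.name, o.date
     ) co
ORDER BY co.name,
         ABS(co.volume - co.prev_volume) DESC NULLS LAST;  -- NULL differences sort last
```

With NULLS LAST, a client that has orders on only one date still appears in the output, with a NULL change. Filtering on prev_volume IS NOT NULL instead would drop such clients entirely, so the choice depends on which behaviour is wanted.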

Answer 1: (score: 0)

Most of the credit for this query goes to Gordon Linoff, but it needed a few tweaks, in particular the where clause below.

SELECT DISTINCT ON (co.name) 
       co.client_id
     , co.name
     , co.DATE
     , (co.volume - co.prev_volume) change
FROM (
     SELECT 
            c.client_id
          , c.name
          , o.DATE
          , SUM(o.volume) AS volume
          , LAG(SUM(o.volume)) OVER (PARTITION BY c.name 
                                     ORDER BY o.DATE) AS prev_volume
     FROM orders o
     INNER JOIN client c USING (client_id)
     GROUP BY
           c.client_id
          , c.name
          , o.DATE
     ) co
WHERE prev_volume IS NOT NULL
ORDER BY
       co.name
     , ABS(co.volume - co.prev_volume)  DESC

Result:

client_id | name    | date       | change
--------: | :------ | :--------- | -----:
       36 | henry   | 2018-01-30 |    -25
       41 | james   | 2018-01-17 |     22
       29 | melinda | 2018-01-29 |    -34

db<>fiddle here
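
For anyone who wants to run the queries locally, here is a minimal setup sketch containing only the sample rows shown in the question. The names are an assumption: the question displays the client key as id and calls the second table order, while the queries use orders and join on client_id, so this sketch follows the queries. The column types are guesses as well.

```sql
-- Hypothetical schema: table and column names follow the queries, not the
-- table listings in the question; types are assumed.
CREATE TABLE client (
    client_id integer PRIMARY KEY,
    name      text NOT NULL
);

CREATE TABLE orders (
    id        integer PRIMARY KEY,
    date      date    NOT NULL,
    volume    integer NOT NULL,
    client_id integer NOT NULL REFERENCES client (client_id)
);

-- Sample rows copied from the question (the elided "..." rows are not included).
INSERT INTO client (client_id, name) VALUES
    (41, 'james'),
    (29, 'melinda'),
    (36, 'henry');

INSERT INTO orders (id, date, volume, client_id) VALUES
    (328, '2018-01-03', 16, 41),
    (411, '2018-01-29', 39, 29),
    (129, '2018-01-13', 73, 29),
    (542, '2018-01-22', 62, 36),
    (301, '2018-01-17', 38, 41),
    (784, '2018-01-08', 84, 29),
    (299, '2018-01-10', 54, 36),
    (300, '2018-01-10', 18, 36),
    (178, '2018-01-30', 37, 36);
```

Note that henry has two orders on 2018-01-10 (54 and 18), which is why the queries sum the volume per client and date before calling LAG: his daily totals are 72, 62 and 37, giving changes of -10 and -25.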