PostgreSQL: how do I select the first row of each group?

Time: 2017-05-04 14:35:54

Tags: sql postgresql greatest-n-per-group

With this query:

WITH responsesNew AS
(
  SELECT DISTINCT responses."studentId", notation, responses."givenHeart", 
  SUM(notation + responses."givenHeart") OVER (partition BY responses."studentId" 
  ORDER BY responses."createdAt") AS total, responses."createdAt"
  FROM responses
)
SELECT responsesNew."studentId", notation, responsesNew."givenHeart", total, 
responsesNew."createdAt"
FROM responsesNew
WHERE total = 3
GROUP BY responsesNew."studentId", notation, responsesNew."givenHeart", total, 
responsesNew."createdAt"
ORDER BY responsesNew."studentId" ASC

I get this result:

studentId | notation | givenHeart | total |      createdAt     |
----------+----------+------------+-------+--------------------+
 374      | 1        | 0          | 3     | 2017-02-13 12:43:03   
 374      | null     | 0          | 3     | 2017-02-15 22:22:17
 639      | 1        | 2          | 3     | 2017-04-03 17:21:30 
 790      | 1        | 0          | 3     | 2017-02-12 21:12:23
 ...

My goal is to keep only the earliest row of each group, like this:

studentId | notation | givenHeart | total |      createdAt     |
----------+----------+------------+-------+--------------------+
 374      | 1        | 0          | 3     | 2017-02-13 12:43:03 
 639      | 1        | 2          | 3     | 2017-04-03 17:21:30 
 790      | 1        | 0          | 3     | 2017-02-12 21:12:23
 ...

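How can I get there?

I have already read many threads here, but nothing I tried with DISTINCT, DISTINCT ON, subqueries in WHERE, LIMIT, etc. worked for me (surely because of my limited understanding). I ran into errors related to window functions, columns missing from ORDER BY, and a few others I can't remember.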

2 answers:

Answer 0: (score: 0)

You can do this with DISTINCT ON. The query would look like this:

WITH responsesNew AS (
      SELECT DISTINCT r."studentId", notation, r."givenHeart", 
             SUM(notation + r."givenHeart") OVER (partition BY r."studentId" 
                                                  ORDER BY r."createdAt") AS total,
             r."createdAt" 
      FROM responses r
     )
SELECT DISTINCT ON (r."studentId") r."studentId", notation, r."givenHeart", total, 
r."createdAt"
FROM responsesNew r
WHERE total = 3
ORDER BY r."studentId" ASC, r."createdAt";

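I'm pretty sure this can be simplified. I just don't understand the purpose of the CTE, and using SELECT DISTINCT in this way is very curious.

If you want a simplified query, ask another question with sample data, the desired results, and an explanation of what you are doing, and include the query or a link to this question.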
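
As a rough sketch of what such a simplification might look like: the CTE still earns its place, because a window function result cannot be filtered in the WHERE clause of the same SELECT, but the inner SELECT DISTINCT appears unnecessary once DISTINCT ON is applied. The inline `responses` rows below are assumptions that mirror the question's sample output; the rows dated 2017-02-10 and 2017-02-11 are invented so that the running totals reach 3, and the CTE name `running` is made up for illustration.

```sql
-- Hypothetical sample data standing in for the real "responses" table.
-- The 2017-02-10 and 2017-02-11 rows are invented for this sketch.
WITH responses ("studentId", notation, "givenHeart", "createdAt") AS (
  VALUES
    (374, 1,    1, TIMESTAMP '2017-02-10 09:00:00'),
    (374, 1,    0, TIMESTAMP '2017-02-13 12:43:03'),
    (374, NULL, 0, TIMESTAMP '2017-02-15 22:22:17'),
    (639, 1,    2, TIMESTAMP '2017-04-03 17:21:30'),
    (790, 1,    1, TIMESTAMP '2017-02-11 08:00:00'),
    (790, 1,    0, TIMESTAMP '2017-02-12 21:12:23')
),
running AS (
  -- Running total per student, ordered by creation time.
  -- NULL + 0 is NULL, which SUM skips, so the total stays unchanged on that row.
  SELECT "studentId", notation, "givenHeart", "createdAt",
         SUM(notation + "givenHeart")
           OVER (PARTITION BY "studentId" ORDER BY "createdAt") AS total
  FROM responses
)
-- DISTINCT ON keeps the first row per "studentId" according to ORDER BY,
-- so sorting by "createdAt" second selects the earliest row with total = 3.
SELECT DISTINCT ON ("studentId")
       "studentId", notation, "givenHeart", total, "createdAt"
FROM running
WHERE total = 3
ORDER BY "studentId", "createdAt";
```

On this sample data the query should return one row per student: 374 (2017-02-13 12:43:03), 639 (2017-04-03 17:21:30) and 790 (2017-02-12 21:12:23). Note that the ORDER BY list has to start with the DISTINCT ON expression.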

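Answer 1: (score: 0)

Use the Row_number() window function to add a row number within each partition, then show only row 1.

There is no need to fully qualify names when only one table is involved. When you do qualify, use aliases to keep the query readable.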

WITH responsesNew AS
(
  SELECT "studentId"
       , notation
       , "givenHeart"
       , SUM(notation + "givenHeart") OVER (partition BY "studentId" ORDER BY "createdAt") AS total
       , "createdAt"
       , Row_number() OVER (PARTITION BY "studentId" ORDER BY "createdAt") As RNum
  FROM responses r
)
SELECT RN."studentId"
     , notation, RN."givenHeart"
     , total
     , RN."createdAt"
FROM responsesNew RN
WHERE total = 3
  AND RNum = 1
GROUP BY RN."studentId"
       , notation
       , RN."givenHeart", total
       , RN."createdAt"
ORDER BY RN."studentId" ASC
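
One caveat worth checking: a row number computed in the same CTE as the running total counts every response of a student, so combining RNum = 1 with total = 3 only keeps students whose very first response already totals 3. A possible variant, sketched here under assumptions, filters on the total first and numbers the remaining rows afterwards. The inline `responses` rows mirror the question's sample output; the rows dated 2017-02-10 and 2017-02-11 are invented so that the running totals reach 3, and the CTE names `running` and `numbered` are made up for illustration.

```sql
-- Hypothetical sample data standing in for the real "responses" table.
-- The 2017-02-10 and 2017-02-11 rows are invented for this sketch.
WITH responses ("studentId", notation, "givenHeart", "createdAt") AS (
  VALUES
    (374, 1,    1, TIMESTAMP '2017-02-10 09:00:00'),
    (374, 1,    0, TIMESTAMP '2017-02-13 12:43:03'),
    (374, NULL, 0, TIMESTAMP '2017-02-15 22:22:17'),
    (639, 1,    2, TIMESTAMP '2017-04-03 17:21:30'),
    (790, 1,    1, TIMESTAMP '2017-02-11 08:00:00'),
    (790, 1,    0, TIMESTAMP '2017-02-12 21:12:23')
),
running AS (
  -- Step 1: running total per student over all of that student's rows.
  SELECT "studentId", notation, "givenHeart", "createdAt",
         SUM(notation + "givenHeart")
           OVER (PARTITION BY "studentId" ORDER BY "createdAt") AS total
  FROM responses
),
numbered AS (
  -- Step 2: WHERE is evaluated before the window function, so the
  -- numbering only covers rows whose running total equals 3.
  SELECT running.*,
         ROW_NUMBER()
           OVER (PARTITION BY "studentId" ORDER BY "createdAt") AS rnum
  FROM running
  WHERE total = 3
)
-- Step 3: keep the earliest qualifying row per student.
SELECT "studentId", notation, "givenHeart", total, "createdAt"
FROM numbered
WHERE rnum = 1
ORDER BY "studentId";
```

With the rows numbered after filtering, the final GROUP BY is no longer needed, since each student contributes at most one row with rnum = 1.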