Need some tips/advice on SQL (the WITH clause, and maybe some optimization)

Asked: 2017-02-06 19:47:07

Tags: postgresql

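I was looking for a way to select the first item of each group in a GROUP BY in PostgreSQL, until I found this Stack Overflow question: Select first row in each GROUP BY group?

There I saw the WITH clause being used. I am trying to learn more of the "advanced" SQL commands, such as PARTITION, WITH, ROW_NUMBER, etc. Until two or three months ago I only knew the basic commands (SELECT, INNER JOIN, LEFT JOIN, ORDER BY, GROUP BY, etc.).

I have a small problem (already solved, but I don't know whether my solution is the better way*).

*better way = I care more about clean SQL code than about performance. This is only for a report that runs once a day and handles no more than 5000 records.

I have two tables in PostgreSQL: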

+----------------------------------------------+
| TABLE NAME: point                            |
+--------+---------------+----------+----------+
|     km |      globalid |      lat |     long |
+--------+---------------+----------+----------+
|  36600 | 1553E2AB-B2F8 | -1774.44 | -5423.58 |
| 364000 | 25EB2465-1B8A | -1773.42 | -5422.03 |
| 362000 | 5FFDE611-88DF | -1771.80 | -5420.37 |
+--------+---------------+----------+----------+


+---------------------------------------------------------+
| TABLE NAME: photo                                       |
+--------------+---------------+------------+-------------+
| attachmentid |  rel_globalid |       date |    filename |
+--------------+---------------+------------+-------------+
|            1 | 1553E2AB-B2F8 | 2015-02-24 | photo01.jpg |
|            2 | 1553E2AB-B2F8 | 2015-02-24 | photo02.jpg |
|          405 | 25EB2465-1B8A | 2015-02-12 | photo03.jpg |
|          406 | 25EB2465-1B8A | 2015-02-12 | photo04.jpg |
|          407 | 25EB2465-1B8A | 2015-02-13 | photo06.jpg |
|            3 | 5FFDE611-88DF | 2015-02-12 | photo07.jpg |
+--------------+---------------+------------+-------------+
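
For anyone who wants to reproduce the problem locally, the two tables above could be created roughly as follows. This is only a sketch: the column types, the primary keys, and the foreign key are assumptions, since the question shows sample values but no DDL.

```sql
-- Column types and constraints are assumptions; only the sample values
-- come from the question.
CREATE TABLE point (
    km       integer,
    globalid text PRIMARY KEY,
    lat      numeric,
    "long"   numeric
);

CREATE TABLE photo (
    attachmentid integer PRIMARY KEY,
    rel_globalid text REFERENCES point (globalid),
    date         date,
    filename     text
);

INSERT INTO point (km, globalid, lat, "long") VALUES
    (36600,  '1553E2AB-B2F8', -1774.44, -5423.58),
    (364000, '25EB2465-1B8A', -1773.42, -5422.03),
    (362000, '5FFDE611-88DF', -1771.80, -5420.37);

INSERT INTO photo (attachmentid, rel_globalid, date, filename) VALUES
    (1,   '1553E2AB-B2F8', '2015-02-24', 'photo01.jpg'),
    (2,   '1553E2AB-B2F8', '2015-02-24', 'photo02.jpg'),
    (405, '25EB2465-1B8A', '2015-02-12', 'photo03.jpg'),
    (406, '25EB2465-1B8A', '2015-02-12', 'photo04.jpg'),
    (407, '25EB2465-1B8A', '2015-02-13', 'photo06.jpg'),
    (3,   '5FFDE611-88DF', '2015-02-12', 'photo07.jpg');
```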

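So, on to the problem:

Each point has one or more photos, but I only need the point data plus its first and last photo. If a point has only one photo, I only need the first photo. If a point has three photos, I only need the first and the third photo.

So, here is how I solved it:

First, I need the first photo of each point, so I partition by rel_globalid and number each photo within its group: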

WITH photos_numbered AS (
    SELECT
      rel_globalid,
      date,
      filename,
      ROW_NUMBER()
      OVER (
        PARTITION BY rel_globalid
        ORDER BY date
      ) AS photo_num
    FROM
      photo
)

With this code I can also get the 2nd photo, the 3rd photo, and so on.
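
One thing worth noting about the numbering: several photos in the sample data share the same date, so ORDER BY date alone does not say which of them comes first. A minimal sketch of a repeatable numbering, assuming attachmentid is an acceptable tie-breaker (that choice is an assumption, not something the question states), and showing the CTE used in a complete statement:

```sql
-- Sketch: attachmentid as a tie-breaker is an assumption; it only makes the
-- numbering repeatable when two photos of the same point share a date.
WITH photos_numbered AS (
    SELECT
      rel_globalid,
      date,
      filename,
      ROW_NUMBER()
      OVER (
        PARTITION BY rel_globalid
        ORDER BY date, attachmentid
      ) AS photo_num
    FROM
      photo
)
SELECT *
FROM
  photos_numbered
WHERE
  photo_num = 2   -- the second photo of every point that has at least two
```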

OK, now I want to get the first photo (still using the WITH above):

SELECT *
FROM
  photos_numbered
WHERE
  photo_num = 1

To get the last photo, I used the following SQL:

SELECT
  p1.*
FROM
  photos_numbered p1
JOIN (
  SELECT
    rel_globalid,
    max(photo_num) photo_num
  FROM
    photos_numbered
  GROUP BY
    rel_globalid
  ) p2
  ON
    p1.rel_globalid = p2.rel_globalid AND
    p1.photo_num = p2.photo_num
WHERE
  p1.photo_num > 1

The WHERE p1.photo_num > 1 is there because, if a point has only one photo, that photo should appear as the first photo and the last photo should come back as NULL.
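
As a possible simplification of this step, the self-join on max(photo_num) can be avoided by computing the number of photos per point in the same CTE. The following is a sketch, not the author's query; the column name photo_count is made up for illustration.

```sql
-- Sketch: photo_count is a hypothetical column name. COUNT(*) OVER the same
-- partition gives the number of photos per point, so no self-join is needed.
WITH photos_numbered AS (
    SELECT
      rel_globalid,
      date,
      filename,
      ROW_NUMBER() OVER (PARTITION BY rel_globalid ORDER BY date) AS photo_num,
      COUNT(*)     OVER (PARTITION BY rel_globalid)               AS photo_count
    FROM
      photo
)
SELECT *
FROM
  photos_numbered
WHERE
  photo_num = photo_count   -- the highest-numbered photo of each point
  AND photo_count > 1       -- skip points that have a single photo
```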

OK, now I have to "convert" the SELECT for the first photo and the SELECT for the last photo into WITH subqueries, and then do a simple SELECT with an INNER JOIN for the first photo and a LEFT JOIN for the last photo:

WITH photos_numbered AS (
    SELECT
      rel_globalid,
      date,
      filename,
      ROW_NUMBER()
      OVER (
        PARTITION BY rel_globalid
        ORDER BY date
      ) AS photo_num
    FROM
      photo
), first_photo AS (
    SELECT *
    FROM
      photos_numbered
    WHERE
      photo_num = 1
), last_photo AS (
    SELECT
      p1.*
    FROM
      photos_numbered p1
    JOIN (
      SELECT
        rel_globalid,
        max(photo_num) photo_num
      FROM
        photos_numbered
      GROUP BY
        rel_globalid
      ) p2
      ON
        p1.rel_globalid = p2.rel_globalid AND
        p1.photo_num = p2.photo_num
    WHERE
      p1.photo_num > 1
)
SELECT DISTINCT
  point.km,
  point.globalid,
  point.lat,
  point."long",
  first_photo.date AS fp_date,
  first_photo.filename AS fp_filename,
  last_photo.date AS lp_date,
  last_photo.filename AS lp_filename
FROM
  point
INNER JOIN
  first_photo
  ON first_photo.rel_globalid = point.globalid
LEFT JOIN
  last_photo
  ON last_photo.rel_globalid = point.globalid
ORDER BY
  km

I think this SQL is huge for such a simple thing!

Does it work? Yes, but I would like some advice, some documentation I can read to understand this better, and some commands I could use to write "better" SQL (as I said, two or three months ago I did not even know the WITH command).
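
One way the whole report might be shortened while staying with the WITH approach is to join the numbered CTE twice and move the first/last conditions into the join clauses. This is a sketch under the same assumptions as the question's tables; photo_count, fp, and lp are illustrative names, not part of the original query.

```sql
-- Sketch: photo_count, fp and lp are hypothetical names. Each join matches at
-- most one row per point, so SELECT DISTINCT is not needed.
WITH photos_numbered AS (
    SELECT
      rel_globalid,
      date,
      filename,
      ROW_NUMBER() OVER (PARTITION BY rel_globalid ORDER BY date) AS photo_num,
      COUNT(*)     OVER (PARTITION BY rel_globalid)               AS photo_count
    FROM
      photo
)
SELECT
  point.km,
  point.globalid,
  point.lat,
  point."long",
  fp.date     AS fp_date,
  fp.filename AS fp_filename,
  lp.date     AS lp_date,
  lp.filename AS lp_filename
FROM
  point
INNER JOIN
  photos_numbered fp
  ON fp.rel_globalid = point.globalid
  AND fp.photo_num = 1                 -- first photo
LEFT JOIN
  photos_numbered lp
  ON lp.rel_globalid = point.globalid
  AND lp.photo_num = lp.photo_count    -- last photo
  AND lp.photo_count > 1               -- NULL when the point has one photo
ORDER BY
  point.km
```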

I tried to add a SQLFiddle link here, but SQLFiddle never worked for me (it always returned an 'oops' message).

2 answers:

Answer 0 (score: 2)

If you are looking for clean SQL, try LEFT JOIN LATERAL with the first_value and last_value window functions instead of common table expressions (the WITH clause):

select *
from point po
left join lateral 
(
   -- explicit frame: with only ORDER BY, the default frame ends at the
   -- current row's last peer, so last_value() would not see later photos
   select first_value( date )     over w as first_photo_date,
          first_value( filename ) over w as first_photo_filename,
          last_value( date )      over w as last_photo_date,
          last_value( filename )  over w as last_photo_filename    
   from photo ph
   where po.globalid = ph.rel_globalid 
   window w as ( order by ph.date
                 rows between unbounded preceding and unbounded following )
   limit 1
) q on true
;

When a point has only one photo, an additional count(*) over() wrapped in a CASE expression can be used to "clear" the last-photo values:

select *
from point po
left join lateral 
(
   -- explicit frame: with only ORDER BY, the default frame ends at the
   -- current row's last peer, so last_value() would not see later photos
   select first_value( date )     over w as first_photo_date,
          first_value( filename ) over w as first_photo_filename,
          case when count(*) over () > 1 
               then last_value( date )    over w
          end as last_photo_date,
          case when count(*) over () > 1 
                then last_value( filename )  over w 
          end as last_photo_filename    
   from photo ph
   where po.globalid = ph.rel_globalid 
   window w as ( order by ph.date
                 rows between unbounded preceding and unbounded following )
   limit 1
) q on true
;

Answer 1 (score: 0)

Based on krokodilko's answer, I wrote a new SQL query without LEFT JOIN LATERAL, because I am using PostgreSQL 9.2 (which does not have LEFT JOIN LATERAL).

SELECT DISTINCT
  po.km,
  po.globalid,
  po.lat,
  po."long",
  ph.fp_date,
  ph.fp_filename,
  ph.lp_date,
  ph.lp_filename
FROM
  point po
INNER JOIN
  (
    SELECT DISTINCT
      rel_globalid,
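      -- ORDER BY ph.date defines which photo is first/last; the explicit frame
      -- lets last_value() see the whole partition, not just up to the current row
      first_value(date) OVER (PARTITION BY ph.rel_globalid ORDER BY ph.date) AS fp_date,
      first_value(filename) OVER (PARTITION BY ph.rel_globalid ORDER BY ph.date) AS fp_filename,
      CASE WHEN count(*) OVER (PARTITION BY ph.rel_globalid) > 1 THEN 
        last_value(date) OVER (PARTITION BY ph.rel_globalid ORDER BY ph.date
                               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
      END AS lp_date,
      CASE WHEN count(*) OVER (PARTITION BY ph.rel_globalid) > 1 THEN 
        last_value(filename) OVER (PARTITION BY ph.rel_globalid ORDER BY ph.date
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
      END AS lp_filename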
    FROM
      photo ph
    ORDER BY
      rel_globalid
  ) ph
  ON ph.rel_globalid = po.globalid

The only thing I don't like is the OVER (PARTITION) in almost every field of the INNER JOIN.
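
The repeated OVER (PARTITION ...) can probably be factored out with a named window, declared once in a WINDOW clause and referenced as OVER w. Named windows have been part of PostgreSQL since window functions were introduced, so this should work on 9.2. The sketch below is a variant of the query above, not a drop-in copy: the window name w is arbitrary, and the ORDER BY date with a whole-partition frame is an assumption about the intended meaning of "first" and "last".

```sql
-- Sketch: the window name w is arbitrary. Ordering by date and framing the
-- whole partition are assumptions about what "first" and "last" should mean.
SELECT
  po.km,
  po.globalid,
  po.lat,
  po."long",
  ph.fp_date,
  ph.fp_filename,
  ph.lp_date,
  ph.lp_filename
FROM
  point po
INNER JOIN
  (
    SELECT DISTINCT
      rel_globalid,
      first_value(date)     OVER w AS fp_date,
      first_value(filename) OVER w AS fp_filename,
      CASE WHEN count(*) OVER w > 1 THEN last_value(date)     OVER w END AS lp_date,
      CASE WHEN count(*) OVER w > 1 THEN last_value(filename) OVER w END AS lp_filename
    FROM
      photo
    WINDOW w AS (
      PARTITION BY rel_globalid
      ORDER BY date
      ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
    )
  ) ph
  ON ph.rel_globalid = po.globalid
```

Because every window column is constant within a partition, the inner SELECT DISTINCT collapses each point's photos to a single row, which is why the outer query no longer needs its own DISTINCT.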