Compare and group rows of a SQL result in Postgres

Time: 2014-08-25 19:52:11

Tags: sql postgresql aggregate-functions window-functions

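I have a table containing a list of records, where each record represents an event. I have to group the records that fall on the same day. Within each day I also have to group the events by id_fertilizer and calibrado, but I must not group the first result with the last one.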

My SQL result looks like this:

    work_date     id_fertilizer calibrado area   begin_date    end_date
[1] '2014-07-22'  43            NULL      0      "07:03:42.0"  "07:08:00.0"
[2] '2014-07-22'  49            NULL      0      "07:08:52.0"  "07:44:04.0"
[3] '2014-07-22'  49            true      54101  "07:49:41.0"  "12:00:05.0"
[4] '2014-07-22'  49            true      4893   "12:00:30.0"  "14:06:13.0"
[5] '2014-07-22'  43            NULL      0      "14:06:51.0"  "14:49:30.0"
[6] '2014-07-22'  43            NULL      12397  "14:50:04.0"  "16:12:03"
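
To reproduce the problem, the rows above can be loaded with a setup along these lines. This is only a sketch: the table name tbl is borrowed from the answer further down, and the column types are assumptions inferred from the printed output, since the actual table definition is not shown.

```sql
-- Hypothetical schema: the column types are guessed from the sample output.
CREATE TEMP TABLE tbl (
   work_date     date
 , id_fertilizer int
 , calibrado     boolean
 , area          int
 , begin_date    time   -- holds a time of day despite the name
 , end_date      time
);

-- The six rows shown in the question, in chronological order.
INSERT INTO tbl (work_date, id_fertilizer, calibrado, area, begin_date, end_date)
VALUES
   ('2014-07-22', 43, NULL,     0, '07:03:42', '07:08:00')
 , ('2014-07-22', 49, NULL,     0, '07:08:52', '07:44:04')
 , ('2014-07-22', 49, true, 54101, '07:49:41', '12:00:05')
 , ('2014-07-22', 49, true,  4893, '12:00:30', '14:06:13')
 , ('2014-07-22', 43, NULL,     0, '14:06:51', '14:49:30')
 , ('2014-07-22', 43, NULL, 12397, '14:50:04', '16:12:03');
```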

I have to merge rows 3 and 4 into one row and rows 5 and 6 into another, taking min(begin_date), max(end_date) and sum(area). Rows 1 and 2 stay as separate rows. In the end I need 4 rows: 1, 2, (3 + 4), (5 + 6).

The result I get looks like this (using window functions), but it is wrong because it merges rows 1, 5 and 6 into a single row:

work_date                id_fertilizer  calibrado  area  begin_date    end_date
"2014-07-22 00:00:00.0"  43             NULL       1     "07:03:42.0"  "16:12:03.0"
"2014-07-22 00:00:00.0"  49             NULL       0     "07:08:52.0"  "07:44:04.0"
"2014-07-22 00:00:00.0"  49             true       5     "07:49:41.0"  "14:06:13.0"

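My result has 3 rows and says that the person worked with id_fertilizer 43 from 07:03:42 to 16:12:03, but also that he worked with id_fertilizer 49 from 07:08:52 to 07:44:04. That makes no sense: I have to respect the chronological order of the events. So the result I expect is: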

      work_date     id_fertilizer  calibrado  area    begin_date    end_date
[1]   '2014-07-22'  43             NULL       0       "07:03:42.0"  "07:08:00.0"
[2]   '2014-07-22'  49             NULL       0       "07:08:52.0"  "07:44:04.0"
[3]   '2014-07-22'  49             true       58994   "07:49:41.0"  "14:06:13.0"
[4]   '2014-07-22'  43             NULL       12397   "14:06:51.0"  "16:12:03"

1 Answer:

Answer 0: (score: 0)

An educated guess:

SELECT work_date, id_fertilizer, calibrado
     , sum(area)       AS area
     , min(begin_date) AS begin_date
     , max(end_date)   AS end_date
FROM  (
   SELECT *
          -- running count of true steps = group number;
          -- count() only skips NULL, so false is turned into NULL first
        , count(step OR NULL) OVER (ORDER BY work_date, begin_date) AS grp
   FROM  (
      SELECT *
             -- true when fertilizer or calibrado differs from the previous row in time
           , (lag(id_fertilizer) OVER w IS DISTINCT FROM id_fertilizer
           OR lag(calibrado)     OVER w IS DISTINCT FROM calibrado) AS step
      FROM   tbl
      WINDOW w AS (PARTITION BY work_date ORDER BY begin_date)
      ) sub1
   ) sub2
GROUP  BY work_date, id_fertilizer, calibrado, grp
ORDER  BY work_date, grp;

This produces your updated result.
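
For comparison, the same kind of grouping of consecutive rows can also be sketched with the difference of two row numbers. This is not part of the original answer, only an illustration under a few assumptions: the table is called tbl as above, begin_date is unique within a work_date, and a group is defined as an unbroken chronological run of rows that share id_fertilizer and calibrado.

```sql
SELECT work_date, id_fertilizer, calibrado
     , sum(area)       AS area
     , min(begin_date) AS begin_date
     , max(end_date)   AS end_date
FROM  (
   SELECT *
          -- position within the day, minus position among rows that share
          -- (id_fertilizer, calibrado): the difference stays constant for an
          -- unbroken run and changes once other rows come in between
        , row_number() OVER (PARTITION BY work_date
                             ORDER BY begin_date)
        - row_number() OVER (PARTITION BY work_date, id_fertilizer, calibrado
                             ORDER BY begin_date) AS grp
   FROM   tbl
   ) sub
GROUP  BY work_date, id_fertilizer, calibrado, grp
ORDER  BY work_date, min(begin_date);
```

PARTITION BY treats NULL values of calibrado as one partition, so rows 1, 5 and 6 of the sample are numbered together, yet row 1 ends up with a different grp than rows 5 and 6 because rows 2 to 4 lie between them. On the six sample rows this should give the four rows requested in the question.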

Related answers with more explanation: