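I have a table of records, where each record represents an event, and I have to group these records by day. Within each day I also have to group the events by id_fertilizer and calibrado, but the first rows must not be merged with the last ones.
My SQL result looks like this: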
work_date id_fertilizer calibrado area begin_date end_date
[1] '2014-07-22' 43 NULL 0 "07:03:42.0" "07:08:00.0"
[2] '2014-07-22' 49 NULL 0 "07:08:52.0" "07:44:04.0"
[3] '2014-07-22' 49 true 54101 "07:49:41.0" "12:00:05.0"
[4] '2014-07-22' 49 true 4893 "12:00:30.0" "14:06:13.0"
[5] '2014-07-22' 43 NULL 0 "14:06:51.0" "14:49:30.0"
[6] '2014-07-22' 43 NULL 12397 "14:50:04.0" "16:12:03"
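I have to merge rows 3 and 4 into one row and rows 5 and 6 into another, taking min() of begin_date, max() of end_date and sum() of area. Rows 1 and 2 stay as separate rows. In the end I must have 4 rows: 1, 2, (3 + 4), (5 + 6).
The result I get (using window functions) looks like this, but it is wrong because it merges rows 1, 5 and 6 into a single row: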
work_date id_fertilizer calibrado area begin_date end_date
"2014-07-22 00:00:00.0" 43 NULL 1 "07:03:42.0" "16:12:03.0"
"2014-07-22 00:00:00.0" 49 NULL 0 "07:08:52.0" "07:44:04.0"
"2014-07-22 00:00:00.0" 49 true 5 "07:49:41.0" "14:06:13.0"
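My result has 3 rows. It says the person worked with id_fertilizer 43 from 07:03:42 to 16:12:03, but also that he worked with id_fertilizer 49 from 07:08:52 to 07:44:04. That makes no sense: I have to respect the chronological order of the events. So the result I expect is: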
work_date id_fertilizer calibrado area begin_date end_date
[1] '2014-07-22' 43 NULL 0 "07:03:42.0" "07:08:00.0"
[2] '2014-07-22' 49 NULL 0 "07:08:52.0" "07:44:04.0"
[3] '2014-07-22' 49 true 58994 "07:49:41.0" "14:06:13.0"
[4] '2014-07-22' 43 NULL 12397 "14:06:51.0" "16:12:03"
Answer 0 (score: 0)
An educated guess:
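SELECT work_date, id_fertilizer, calibrado
     , sum(area)       AS area
     , min(begin_date) AS begin_date
     , max(end_date)   AS end_date
FROM  (
   SELECT *
          -- running count of TRUE steps = group number;
          -- "OR NULL" turns FALSE into NULL, which count() ignores
        , count(step OR NULL) OVER (PARTITION BY work_date ORDER BY begin_date) AS grp
   FROM  (
      SELECT *
             -- TRUE when id_fertilizer or calibrado differs from the
             -- chronologically previous row of the same day
           , (lag(id_fertilizer) OVER w IS DISTINCT FROM id_fertilizer
              OR lag(calibrado) OVER w IS DISTINCT FROM calibrado) AS step
      FROM   tbl
      WINDOW w AS (PARTITION BY work_date ORDER BY begin_date)
      ) sub1
   ) sub2
GROUP  BY work_date, id_fertilizer, calibrado, grp
ORDER  BY work_date, begin_date;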
This produces your updated result.
Related answers with more explanation:
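To sanity-check the idea of merging only chronologically consecutive rows, here is a minimal sketch that loads the six sample rows from the question into an in-memory SQLite table through Python's sqlite3 module. Several things are assumptions made for the demo and are not from the question: the column types, calibrado stored as 1/NULL, the times stored as text without the trailing ".0", and the helper name collapse_runs. The SQL is written in SQLite's dialect (IS NOT as the NULL-safe comparison, a running sum(step) for the group number) and needs SQLite 3.25 or newer for window functions, so treat it as an illustration and not as a drop-in PostgreSQL query.

```python
import sqlite3

# Sample rows from the question (calibrado: None = NULL, 1 = true).
ROWS = [
    ("2014-07-22", 43, None, 0,     "07:03:42", "07:08:00"),
    ("2014-07-22", 49, None, 0,     "07:08:52", "07:44:04"),
    ("2014-07-22", 49, 1,    54101, "07:49:41", "12:00:05"),
    ("2014-07-22", 49, 1,    4893,  "12:00:30", "14:06:13"),
    ("2014-07-22", 43, None, 0,     "14:06:51", "14:49:30"),
    ("2014-07-22", 43, None, 12397, "14:50:04", "16:12:03"),
]

QUERY = """
SELECT work_date, id_fertilizer, calibrado,
       sum(area)       AS area,
       min(begin_date) AS begin_date,
       max(end_date)   AS end_date
FROM (
    -- running sum of step flags numbers each run of consecutive rows
    SELECT *, sum(step) OVER (PARTITION BY work_date ORDER BY begin_date) AS grp
    FROM (
        -- step = 1 when either key differs from the previous row in time order
        SELECT *,
               CASE WHEN lag(id_fertilizer) OVER w IS NOT id_fertilizer
                      OR lag(calibrado)     OVER w IS NOT calibrado
                    THEN 1 ELSE 0 END AS step
        FROM tbl
        WINDOW w AS (PARTITION BY work_date ORDER BY begin_date)
    ) AS sub1
) AS sub2
GROUP BY work_date, grp, id_fertilizer, calibrado
ORDER BY work_date, grp
"""


def collapse_runs(rows):
    """Hypothetical helper: load rows into a temp table and run QUERY."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tbl (work_date TEXT, id_fertilizer INTEGER, "
        "calibrado INTEGER, area INTEGER, begin_date TEXT, end_date TEXT)"
    )
    con.executemany("INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in collapse_runs(ROWS):
        print(row)
```

The important design choice is that the window is ordered by begin_date only, so a change in id_fertilizer or calibrado always starts a new group, and the later rows for id_fertilizer 43 cannot be folded back into the first one.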