I have a history table like this:
+------------------+------------------+---------+-------+
| valid_from       | valid_to         | Profit  | ID    |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 22.05.2019 23:42 | 10      | 12345 |
| 22.05.2019 23:42 | 28.05.2019 13:11 | 10      | 12345 |
| 28.05.2019 13:11 | 28.05.2019 23:59 | 10      | 12345 |
| 28.05.2019 23:59 | 29.05.2019 06:48 | 123     | 12345 |
| 29.05.2019 06:48 | 29.05.2019 13:21 | 123     | 12345 |
| 29.05.2019 13:21 | 29.05.2019 23:59 | 123     | 12345 |
| 29.05.2019 23:59 | 30.05.2019 06:39 | 10      | 12345 |
| 30.05.2019 06:39 | 30.05.2019 12:37 | 123     | 12345 |
| 30.05.2019 12:37 | 31.05.2019 00:09 | 123     | 12345 |
| 31.05.2019 00:09 | 31.05.2019 08:41 | 145     | 12345 |
| 31.05.2019 08:41 | 01.06.2019 00:22 | 145     | 12345 |
+------------------+------------------+---------+-------+
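I have removed some columns. As a result, consecutive rows that now differ only in their validity period, such as rows 1, 2 and 3, can be merged into one.
At first I tried the following GROUP BY statement: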
SELECT MIN(valid_from) AS valid_from
      ,MAX(valid_to)   AS valid_to
      ,Profit
      ,ID
INTO [repaired_archiv]
FROM temp.[wrong_archiv]
GROUP BY Profit
        ,ID
The result is:
+------------------+------------------+---------+-------+
| valid_from       | valid_to         | Profit  | ID    |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 30.05.2019 06:39 | 10      | 12345 |
| 28.05.2019 23:59 | 31.05.2019 00:09 | 123     | 12345 |
| 31.05.2019 00:09 | 01.06.2019 00:22 | 145     | 12345 |
+------------------+------------------+---------+-------+
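But as you can see, valid_to in the first row is wrong. The GROUP BY statement is to blame: it merges every row with Profit 10 into one group, including the later row that starts on 29.05.2019 23:59, even though rows with Profit 123 lie in between. I don't know how to get this result instead: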
+------------------+------------------+---------+-------+
| valid_from       | valid_to         | Profit  | ID    |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 28.05.2019 23:59 | 10      | 12345 |
| 28.05.2019 23:59 | 29.05.2019 23:59 | 123     | 12345 |
| 29.05.2019 23:59 | 30.05.2019 06:39 | 10      | 12345 |
| 30.05.2019 06:39 | 31.05.2019 00:09 | 123     | 12345 |
| 31.05.2019 00:09 | 01.06.2019 00:22 | 145     | 12345 |
+------------------+------------------+---------+-------+
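Answer 0 (score: 2)
You need two row_number() calls. Their difference stays constant within each run of consecutive rows that share the same id and profit, so it can serve as an extra grouping key: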
select min(valid_from) as valid_from, max(valid_to) as valid_to, id, profit
from (select t.*,
             row_number() over (order by valid_from) as seq1,
             row_number() over (partition by id, profit order by valid_from) as seq2
      from temp.[wrong_archiv] t
     ) t
group by id, profit, (seq1 - seq2)
order by valid_from;
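
To see why the difference of the two row numbers works, it may help to run the query against the sample rows from the question. The script below is only a sketch under a few assumptions: the table variable @history and the extra output column island are hypothetical names introduced for illustration, the dates are rewritten as ISO 8601 strings and stored as datetime2, and seq1 is partitioned by ID. That last point is a deviation from the query above: with a single ID it changes nothing, but it should keep the runs intact if rows of several IDs interleave in time.

```sql
-- Sample rows from the question in a table variable.
-- @history is a hypothetical stand-in for temp.[wrong_archiv].
DECLARE @history TABLE (
    valid_from datetime2(0) NOT NULL,
    valid_to   datetime2(0) NOT NULL,
    Profit     int          NOT NULL,
    ID         int          NOT NULL
);

INSERT INTO @history (valid_from, valid_to, Profit, ID)
VALUES ('2019-05-20T00:02:00', '2019-05-22T23:42:00',  10, 12345),
       ('2019-05-22T23:42:00', '2019-05-28T13:11:00',  10, 12345),
       ('2019-05-28T13:11:00', '2019-05-28T23:59:00',  10, 12345),
       ('2019-05-28T23:59:00', '2019-05-29T06:48:00', 123, 12345),
       ('2019-05-29T06:48:00', '2019-05-29T13:21:00', 123, 12345),
       ('2019-05-29T13:21:00', '2019-05-29T23:59:00', 123, 12345),
       ('2019-05-29T23:59:00', '2019-05-30T06:39:00',  10, 12345),
       ('2019-05-30T06:39:00', '2019-05-30T12:37:00', 123, 12345),
       ('2019-05-30T12:37:00', '2019-05-31T00:09:00', 123, 12345),
       ('2019-05-31T00:09:00', '2019-05-31T08:41:00', 145, 12345),
       ('2019-05-31T08:41:00', '2019-06-01T00:22:00', 145, 12345);

WITH numbered AS (
    SELECT h.*,
           -- position of the row within the whole history of its ID
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY valid_from) AS seq1,
           -- position of the row among the rows with the same ID and Profit
           ROW_NUMBER() OVER (PARTITION BY ID, Profit ORDER BY valid_from) AS seq2
    FROM @history AS h
)
SELECT MIN(valid_from) AS valid_from,
       MAX(valid_to)   AS valid_to,
       Profit,
       ID,
       seq1 - seq2     AS island  -- shown only to make the grouping visible
FROM numbered
GROUP BY ID, Profit, seq1 - seq2
ORDER BY ID, valid_from;
```

Within a run of consecutive rows with the same Profit, seq1 and seq2 both increase by 1 per row, so their difference does not change. As soon as a row with another Profit comes in between, seq1 moves on while seq2 does not, and the next run of that Profit gets a larger difference. For the sample data the island values work out to 0, 3, 3, 4 and 9 for the five result rows. The second and third rows share the value 3 but differ in Profit, which is why Profit has to stay in the GROUP BY next to the difference.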