Repairing history data

Time: 2019-07-09 14:19:57

Tags: sql-server

I have a history table like this:

+------------------+------------------+---------+-------+
| valid_from       | valid_to         |  Profit | ID    |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 22.05.2019 23:42 |      10 | 12345 |
| 22.05.2019 23:42 | 28.05.2019 13:11 |      10 | 12345 |
| 28.05.2019 13:11 | 28.05.2019 23:59 |      10 | 12345 |
| 28.05.2019 23:59 | 29.05.2019 06:48 |     123 | 12345 |
| 29.05.2019 06:48 | 29.05.2019 13:21 |     123 | 12345 |
| 29.05.2019 13:21 | 29.05.2019 23:59 |     123 | 12345 |
| 29.05.2019 23:59 | 30.05.2019 06:39 |      10 | 12345 |
| 30.05.2019 06:39 | 30.05.2019 12:37 |     123 | 12345 |
| 30.05.2019 12:37 | 31.05.2019 00:09 |     123 | 12345 |
| 31.05.2019 00:09 | 31.05.2019 08:41 |     145 | 12345 |
| 31.05.2019 08:41 | 01.06.2019 00:22 |     145 | 12345 |
+------------------+------------------+---------+-------+

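I removed some columns from the table, so consecutive rows are now identical apart from their validity dates. For example, rows 1, 2, and 3 can be merged into a single row.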

At first, I tried the following GROUP BY statement:

SELECT MIN(valid_from) AS valid_from
      ,MAX(valid_to)   AS valid_to
      ,Profit
      ,ID
INTO [repaired_archiv]
FROM temp.[wrong_archiv]
GROUP BY Profit
        ,ID

The result is:

+------------------+------------------+---------+-------+
| valid_from       | valid_to         |  Profit | ID    |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 30.05.2019 06:39 |      10 | 12345 |
| 28.05.2019 23:59 | 31.05.2019 00:09 |     123 | 12345 |
| 31.05.2019 00:09 | 01.06.2019 00:22 |     145 | 12345 |
+------------------+------------------+---------+-------+

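But as you can see, the valid_to in the first row is wrong, and the range in the second row overlaps other periods for the same reason. The cause is the GROUP BY statement: it merges all rows with the same Profit and ID into one group, even when they are not consecutive in time. I don't know how to get a result like this: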

+------------------+------------------+---------+-------+
| valid_from       | valid_to         |  Profit | ID    |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 28.05.2019 23:59 |      10 | 12345 |
| 28.05.2019 23:59 | 29.05.2019 23:59 |     123 | 12345 |
| 29.05.2019 23:59 | 30.05.2019 06:39 |      10 | 12345 |
| 30.05.2019 06:39 | 31.05.2019 00:09 |     123 | 12345 |
| 31.05.2019 00:09 | 01.06.2019 00:22 |     145 | 12345 |
+------------------+------------------+---------+-------+

1 answer:

Answer 0: (score: 2)

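You need two row_number() calls. Their difference stays constant within each run of consecutive rows that share the same profit, so it can serve as an additional grouping key: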

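select min(valid_from) as valid_from, max(valid_to) as valid_to, id, profit
from (select t.*,
             -- seq1: position of the row within its id, ordered by time
             -- (partitioned by id so that rows of other ids cannot break a run)
             row_number() over (partition by id order by valid_from) as seq1,
             -- seq2: position of the row within its (id, profit) combination
             row_number() over (partition by id, profit order by valid_from) as seq2
      from temp.[wrong_archiv] t
     ) t
-- seq1 - seq2 is constant across consecutive rows with the same profit
group by id, profit, (seq1 - seq2)
order by valid_from;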
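
To see why the difference of the two row numbers works, it can help to look at the intermediate values before grouping. The following is only a sketch: #wrong_archiv is a hypothetical temp table filled with the sample rows from the question, the datetime2(0) column types and the grp alias are assumptions, and seq1 is partitioned by ID on the assumption that the real table may hold more than one ID.

```sql
-- Hypothetical temp table holding the sample rows from the question.
-- Column types are assumed; the real table may differ.
CREATE TABLE #wrong_archiv (
    valid_from datetime2(0) NOT NULL,
    valid_to   datetime2(0) NOT NULL,
    Profit     int          NOT NULL,
    ID         int          NOT NULL
);

INSERT INTO #wrong_archiv (valid_from, valid_to, Profit, ID) VALUES
('2019-05-20T00:02:00', '2019-05-22T23:42:00',  10, 12345),
('2019-05-22T23:42:00', '2019-05-28T13:11:00',  10, 12345),
('2019-05-28T13:11:00', '2019-05-28T23:59:00',  10, 12345),
('2019-05-28T23:59:00', '2019-05-29T06:48:00', 123, 12345),
('2019-05-29T06:48:00', '2019-05-29T13:21:00', 123, 12345),
('2019-05-29T13:21:00', '2019-05-29T23:59:00', 123, 12345),
('2019-05-29T23:59:00', '2019-05-30T06:39:00',  10, 12345),
('2019-05-30T06:39:00', '2019-05-30T12:37:00', 123, 12345),
('2019-05-30T12:37:00', '2019-05-31T00:09:00', 123, 12345),
('2019-05-31T00:09:00', '2019-05-31T08:41:00', 145, 12345),
('2019-05-31T08:41:00', '2019-06-01T00:22:00', 145, 12345);

-- Show both row numbers and their difference for every row, without grouping.
SELECT valid_from, valid_to, Profit, ID,
       seq1, seq2,
       seq1 - seq2 AS grp   -- grp is an illustrative alias for the grouping key
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY ID ORDER BY valid_from) AS seq1,
             ROW_NUMBER() OVER (PARTITION BY ID, Profit ORDER BY valid_from) AS seq2
      FROM #wrong_archiv t
     ) t
ORDER BY ID, valid_from;

DROP TABLE #wrong_archiv;
```

For the sample data, grp is 0 for the first three rows (Profit 10), 3 for the next three (Profit 123), 3 for the single Profit 10 row that follows, 4 for the next two Profit 123 rows, and 9 for the two Profit 145 rows. The value 3 appears for two different profits, which is why the grouping key has to be the combination of id, profit, and seq1 - seq2 rather than the difference alone.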