我有下表,
id id_concept start_date end_date
----------------------------------------------
100 282 20/06/2016 24/06/2016
100 282 15/07/2016 18/07/2016
300 282 01/09/2016 02/09/2016
我需要将具有相同ID,id_concept的记录和一个记录的END_DATE之间的时间与下一个记录的START_DATE之间的时间合并为30天或更短(< = 30)
对于组合记录,我需要将start_date作为记录的第一个start_date,并将end_date作为最后一个记录的end_date
o / p应该是,
id id_concept start_date end_date count
---------------------------------------------------
100 282 20/06/2016 18/07/2016 2
300 282 01/09/2016 02/09/2016 1
答案 0 :(得分:1)
PostgreSQL 9.3架构设置:
CREATE TABLE table_name ( id, id_concept, start_date, end_date ) AS
SELECT 100, 282, DATE '2016-06-20', DATE '2016-06-24' UNION ALL
SELECT 100, 282, DATE '2016-07-15', DATE '2016-07-18' UNION ALL
SELECT 300, 282, DATE '2016-09-01', DATE '2016-09-02';
查询1 :
SELECT id,
id_concept,
MIN( start_date ) AS start_date,
MAX( end_date ) AS end_date,
COUNT(*) AS "count"
FROM (
SELECT id,
id_concept,
start_date,
end_date,
SUM( diff ) OVER (
PARTITION BY id, id_concept
ORDER BY start_date, end_date
) AS grp
FROM (
SELECT t.*,
CASE
WHEN LAG( end_date ) OVER (
PARTITION BY id, id_concept
ORDER BY start_date, end_date
) >= start_date - INTERVAL '30' DAY
THEN 0
ELSE 1
END AS diff
FROM table_name t
) t
) t
GROUP BY id, id_concept, grp
<强> Results 强>:
| id | id_concept | start_date | end_date | count |
|-----|------------|-----------------------------|-----------------------------|-------|
| 300 | 282 | September, 01 2016 00:00:00 | September, 02 2016 00:00:00 | 1 |
| 100 | 282 | June, 20 2016 00:00:00 | July, 18 2016 00:00:00 | 2 |