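I'm trying to partition a table based on when some data changes (I'm not sure how best to describe it).
An example explains it best.
I have a table like this (simplified for the example):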
| ID | TEST_DATA | LOGDATETIME |
------------------------------------------------
| 1 | a | June, 19 2013 00:13:23+0000 |
| 2 | a | June, 19 2013 00:13:24+0000 |
| 3 | a | June, 19 2013 00:13:25+0000 |
| 4 | b | June, 19 2013 00:13:26+0000 |
| 5 | b | June, 19 2013 00:13:27+0000 |
| 6 | b | June, 19 2013 00:13:28+0000 |
| 7 | a | June, 19 2013 00:13:29+0000 |
| 8 | a | June, 19 2013 00:13:30+0000 |
| 9 | a | June, 19 2013 00:13:31+0000 |
I want to partition (group) the rows by TEST_DATA, like this:
| ID | TEST_DATA | LOGDATETIME | grouping
------------------------------------------------
| 1 | a | June, 19 2013 00:13:23+0000 | 1
| 2 | a | June, 19 2013 00:13:24+0000 | 1
| 3 | a | June, 19 2013 00:13:25+0000 | 1
| 4 | b | June, 19 2013 00:13:26+0000 | 2
| 5 | b | June, 19 2013 00:13:27+0000 | 2
| 6 | b | June, 19 2013 00:13:28+0000 | 2
| 7 | a | June, 19 2013 00:13:29+0000 | 3
| 8 | a | June, 19 2013 00:13:30+0000 | 3
| 9 | a | June, 19 2013 00:13:31+0000 | 3
I want to keep the rows in LOGDATETIME order, but start a new group every time TEST_DATA changes.
SQLFiddle:http://sqlfiddle.com/#!12/d9c17/1
Answer 0 (score: 2)
A bit of a hack:
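SELECT id, test_data, logdatetime
       -- running count of value changes; + 1 so numbering starts at 1
      ,SUM(CASE WHEN test_data = prev_data THEN 0 ELSE 1 END)
           OVER (ORDER BY id) + 1 AS grouping
  FROM ( SELECT id, test_data, logdatetime
                -- previous row's value; the first row falls back to its own value
               ,COALESCE( LAG(test_data) OVER (ORDER BY id)
                         ,test_data
                        ) AS prev_data
           FROM test t
       ) x;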
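This uses the analytic function LAG to add a pseudo-column to each row containing the test_data value of the previous row. It then uses the analytic function SUM to increment an accumulator every time test_data differs from the previous row's value. The COALESCE makes the first row compare equal to itself, so the running sum starts at 0 and the + 1 makes the group numbers start at 1.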
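The same "flag each change, then take a running sum" idea can be sketched outside the database. The snippet below is only an illustration of the logic, not part of the answer's SQL; the function name `label_runs` and the in-memory `rows` list are assumptions made for the example.

```python
from itertools import accumulate

# Hypothetical in-memory copy of the (id, test_data) pairs from the question.
rows = [
    (1, "a"), (2, "a"), (3, "a"),
    (4, "b"), (5, "b"), (6, "b"),
    (7, "a"), (8, "a"), (9, "a"),
]


def label_runs(values):
    """Return a 1-based group number per value; a new group starts on each change."""
    values = list(values)
    if not values:
        return []
    # LAG + COALESCE: the first element is compared against itself.
    prev = [values[0]] + values[:-1]
    # CASE WHEN test_data = prev_data THEN 0 ELSE 1 END
    flags = [0 if cur == p else 1 for cur, p in zip(values, prev)]
    # SUM(...) OVER (ORDER BY id) + 1
    return [total + 1 for total in accumulate(flags)]


groups = label_runs(test_data for _, test_data in rows)
for (row_id, test_data), group in zip(rows, groups):
    print(row_id, test_data, group)
```

Each commented line corresponds to one piece of the query: the shifted list plays the role of LAG, the 0/1 flags are the CASE expression, and `accumulate` is the windowed SUM.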
Step by step:
postgres=# SELECT id, test_data, logdatetime
postgres-# ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(# ,test_data) AS prev_data
postgres-# FROM test t;
id | test_data | logdatetime | prev_data
----+-----------+-------------------------+-----------
1 | a | 2013-06-19 00:13:23.184 | a
2 | a | 2013-06-19 00:13:24.312 | a
3 | a | 2013-06-19 00:13:25.184 | a
4 | b | 2013-06-19 00:13:26.184 | a
5 | b | 2013-06-19 00:13:27.184 | b
6 | b | 2013-06-19 00:13:28.184 | b
7 | a | 2013-06-19 00:13:29.184 | b
8 | a | 2013-06-19 00:13:30.184 | a
9 | a | 2013-06-19 00:13:31.184 | a
(9 rows)
postgres=# SELECT id, test_data, logdatetime
postgres-# ,CASE WHEN test_data = prev_data THEN 0 ELSE 1 END AS counter
postgres-# FROM ( SELECT id, test_data, logdatetime
postgres(# ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(# ,test_data) AS prev_data
postgres(# FROM test t
postgres(# ) x;
id | test_data | logdatetime | counter
----+-----------+-------------------------+---------
1 | a | 2013-06-19 00:13:23.184 | 0
2 | a | 2013-06-19 00:13:24.312 | 0
3 | a | 2013-06-19 00:13:25.184 | 0
4 | b | 2013-06-19 00:13:26.184 | 1
5 | b | 2013-06-19 00:13:27.184 | 0
6 | b | 2013-06-19 00:13:28.184 | 0
7 | a | 2013-06-19 00:13:29.184 | 1
8 | a | 2013-06-19 00:13:30.184 | 0
9 | a | 2013-06-19 00:13:31.184 | 0
postgres=# SELECT id, test_data, logdatetime
postgres-# ,SUM( CASE WHEN test_data = prev_data THEN 0 ELSE 1 END )
postgres-# OVER (PARTITION BY 'x' ORDER BY id) + 1 AS grouping
postgres-# FROM ( SELECT id, test_data, logdatetime
postgres(# ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(# ,test_data) AS prev_data
postgres(# FROM test t
postgres(# ) x;
id | test_data | logdatetime | grouping
----+-----------+-------------------------+----------
1 | a | 2013-06-19 00:13:23.184 | 1
2 | a | 2013-06-19 00:13:24.312 | 1
3 | a | 2013-06-19 00:13:25.184 | 1
4 | b | 2013-06-19 00:13:26.184 | 2
5 | b | 2013-06-19 00:13:27.184 | 2
6 | b | 2013-06-19 00:13:28.184 | 2
7 | a | 2013-06-19 00:13:29.184 | 3
8 | a | 2013-06-19 00:13:30.184 | 3
9 | a | 2013-06-19 00:13:31.184 | 3
(9 rows)
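To try the query without a PostgreSQL server, here is a rough sketch that runs an equivalent statement through Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). Two things differ from the answer and are my own choices: the windows are ordered by logdatetime with id as a tie-breaker, since the question asks for log-time order, and the output column is named `grp`. The timestamps are stored as plain text copied from the question's sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, test_data TEXT, logdatetime TEXT)"
)
# Sample rows from the question: ids 1..9, seconds 23..31.
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        (i, value, f"2013-06-19 00:13:{22 + i:02d}")
        for i, value in enumerate("aaabbbaaa", start=1)
    ],
)

QUERY = """
SELECT id, test_data,
       SUM(CASE WHEN test_data = prev_data THEN 0 ELSE 1 END)
           OVER (ORDER BY logdatetime, id) + 1 AS grp
  FROM ( SELECT id, test_data, logdatetime,
                COALESCE(LAG(test_data) OVER (ORDER BY logdatetime, id),
                         test_data) AS prev_data
           FROM test
       ) AS x
 ORDER BY logdatetime, id
"""

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Ordering by id alone, as the answer does, gives the same groups for this sample because the ids follow the timestamps.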