Partition (group) a table on data change

Time: 2013-06-19 02:21:19

Tags: postgresql

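I am trying to partition a table into groups, starting a new group each time a certain column's value changes (I'm not sure how best to explain it).

An example shows it best.

I have a table like this (simplified for the example):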

| ID | TEST_DATA |                 LOGDATETIME |
------------------------------------------------
|  1 |         a | June, 19 2013 00:13:23+0000 |
|  2 |         a | June, 19 2013 00:13:24+0000 |
|  3 |         a | June, 19 2013 00:13:25+0000 |
|  4 |         b | June, 19 2013 00:13:26+0000 |
|  5 |         b | June, 19 2013 00:13:27+0000 |
|  6 |         b | June, 19 2013 00:13:28+0000 |
|  7 |         a | June, 19 2013 00:13:29+0000 |
|  8 |         a | June, 19 2013 00:13:30+0000 |
|  9 |         a | June, 19 2013 00:13:31+0000 |

I would like to partition (group) by TEST_DATA, like this:

| ID | TEST_DATA |                 LOGDATETIME | GROUPING |
-----------------------------------------------------------
|  1 |         a | June, 19 2013 00:13:23+0000 |        1 |
|  2 |         a | June, 19 2013 00:13:24+0000 |        1 |
|  3 |         a | June, 19 2013 00:13:25+0000 |        1 |
|  4 |         b | June, 19 2013 00:13:26+0000 |        2 |
|  5 |         b | June, 19 2013 00:13:27+0000 |        2 |
|  6 |         b | June, 19 2013 00:13:28+0000 |        2 |
|  7 |         a | June, 19 2013 00:13:29+0000 |        3 |
|  8 |         a | June, 19 2013 00:13:30+0000 |        3 |
|  9 |         a | June, 19 2013 00:13:31+0000 |        3 |

I want to keep the rows ordered by log time, but start a new grouping every time TEST_DATA changes.

SQLFiddle:http://sqlfiddle.com/#!12/d9c17/1

1 answer:

Answer 0 (score: 2)

A bit of a hack:

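SELECT id, test_data, logdatetime
       -- running total of the change flags; + 1 so numbering starts at 1
      ,SUM(CASE WHEN test_data = prev_data THEN 0 ELSE 1 END)
          OVER (ORDER BY id) + 1 AS grouping
  FROM ( SELECT id, test_data, logdatetime
                -- previous row's test_data; the first row falls back to its own value
               ,COALESCE( LAG(test_data) OVER (ORDER BY id)
                         ,test_data
                        ) AS prev_data
           FROM test t
       ) x
 ORDER BY id;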

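Use the analytic function LAG to add a pseudo-column to each row holding the test_data value of the previous row (COALESCE makes the first row, which has no predecessor, fall back to its own value). Then use the analytic function SUM as an accumulator that is incremented whenever test_data differs from the previous row's value.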
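To make the logic easier to follow outside the database, here is a minimal Python sketch of the same two steps. It is only an illustration, not part of the query above: the function name `assign_groups` and the hard-coded `rows` list are hypothetical, the rows are assumed to be already sorted by id, and test_data is assumed to contain no NULLs (SQL comparison with NULL behaves differently from Python's `==`).

```python
from itertools import accumulate

# Sample rows mirroring the question's table: (id, test_data), sorted by id.
rows = [(1, "a"), (2, "a"), (3, "a"),
        (4, "b"), (5, "b"), (6, "b"),
        (7, "a"), (8, "a"), (9, "a")]


def assign_groups(rows):
    """Return (id, test_data, grouping) tuples, emulating LAG + running SUM."""
    values = [value for _, value in rows]
    if not values:
        return []
    # LAG(test_data) wrapped in COALESCE: the first row falls back to its own value.
    prev_values = [values[0]] + values[:-1]
    # CASE WHEN test_data = prev_data THEN 0 ELSE 1 END
    flags = [0 if cur == prev else 1 for cur, prev in zip(values, prev_values)]
    # SUM(...) OVER (ORDER BY id): running total of the change flags.
    running = accumulate(flags)
    # "+ 1" so that the first group is numbered 1 rather than 0.
    return [(row_id, value, total + 1)
            for (row_id, value), total in zip(rows, running)]


if __name__ == "__main__":
    for row in assign_groups(rows):
        print(row)
```

The flag list plays the role of the intermediate "counter" column, and the running total plays the role of the final "grouping" column.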

Step by step:

postgres=# SELECT id, test_data, logdatetime
postgres-#       ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(#       ,test_data) AS prev_data
postgres-#   FROM test t;
 id | test_data |       logdatetime       | prev_data
----+-----------+-------------------------+-----------
  1 | a         | 2013-06-19 00:13:23.184 | a
  2 | a         | 2013-06-19 00:13:24.312 | a
  3 | a         | 2013-06-19 00:13:25.184 | a
  4 | b         | 2013-06-19 00:13:26.184 | a
  5 | b         | 2013-06-19 00:13:27.184 | b
  6 | b         | 2013-06-19 00:13:28.184 | b
  7 | a         | 2013-06-19 00:13:29.184 | b
  8 | a         | 2013-06-19 00:13:30.184 | a
  9 | a         | 2013-06-19 00:13:31.184 | a
(9 rows)

postgres=# SELECT id, test_data, logdatetime
postgres-#       ,CASE WHEN test_data = prev_data THEN 0 ELSE 1 END AS counter
postgres-#   FROM  ( SELECT id, test_data, logdatetime
postgres(#                 ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(#                 ,test_data) AS prev_data
postgres(#             FROM test t
postgres(#         ) x;
 id | test_data |       logdatetime       | counter
----+-----------+-------------------------+---------
  1 | a         | 2013-06-19 00:13:23.184 |       0
  2 | a         | 2013-06-19 00:13:24.312 |       0
  3 | a         | 2013-06-19 00:13:25.184 |       0
  4 | b         | 2013-06-19 00:13:26.184 |       1
  5 | b         | 2013-06-19 00:13:27.184 |       0
  6 | b         | 2013-06-19 00:13:28.184 |       0
  7 | a         | 2013-06-19 00:13:29.184 |       1
  8 | a         | 2013-06-19 00:13:30.184 |       0
  9 | a         | 2013-06-19 00:13:31.184 |       0
(9 rows)

postgres=# SELECT id, test_data, logdatetime
postgres-#       ,SUM( CASE WHEN test_data = prev_data THEN 0 ELSE 1 END )
postgres-#            OVER (PARTITION BY 'x' ORDER BY id) + 1 AS grouping
postgres-#   FROM ( SELECT id, test_data, logdatetime
postgres(#                ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(#                ,test_data) AS prev_data
postgres(#            FROM test t
postgres(#        ) x;
 id | test_data |       logdatetime       | grouping
----+-----------+-------------------------+----------
  1 | a         | 2013-06-19 00:13:23.184 |        1
  2 | a         | 2013-06-19 00:13:24.312 |        1
  3 | a         | 2013-06-19 00:13:25.184 |        1
  4 | b         | 2013-06-19 00:13:26.184 |        2
  5 | b         | 2013-06-19 00:13:27.184 |        2
  6 | b         | 2013-06-19 00:13:28.184 |        2
  7 | a         | 2013-06-19 00:13:29.184 |        3
  8 | a         | 2013-06-19 00:13:30.184 |        3
  9 | a         | 2013-06-19 00:13:31.184 |        3
(9 rows)