I have a table:
id | volume_id| ... |
----+----------+-----+
1 | 1 | ... |
2 | 2 | ... |
3 | 1 | ... |
4 | 3 | ... |
5 | 2 | ... |
...
I can run a simple group-by query:
select volume_id, count(*), min(id) as min_id, max(id) as max_id
from my_table
group by volume_id;
which produces this result:
volume_id | count | min_id | max_id
-----------+-------+--------+--------
1 | 67330 | ... | ...
2 | 67330 | ... | ...
3 | 67330 | ... | ...
4 | 67330 | ... | ...
But I want to split the results into groups of 40K rows, so the result should look like this:
volume_id | count | min_id | max_id
-----------+-------+--------+--------
1 | 40000 | ... | ... <- first group of IDs for volume 1
1 | 27330 | ... | ... <- second group of IDs for volume 1
2 | 40000 | ... | ...
2 | 27330 | ... | ...
3 | 40000 | ... | ...
4 | 27330 | ... | ...
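The IDs should be split so that the max_id of the first group is smaller than the min_id of the second group, and so on.
If anyone knows how to write such a query (or a plsql function, if there is no other way), I would appreciate it.
I am using PostgreSQL 9.5.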
Answer 0 (score: 4)
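You can use rank() (or row_number(), if there are no duplicates) to number the rows within each volume_id in id order. Simple arithmetic in the group by then splits each volume into chunks of 40000: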
select volume_id, count(*), min(id) as min_id, max(id) as max_id
from (select t.*,
rank() over (partition by volume_id order by id) as seqnum
from my_table t
) t
group by volume_id, floor((seqnum - 1) / 40000)
order by volume_id, min(id);
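
To see why grouping on floor((seqnum - 1) / 40000) gives the requested chunks, here is a minimal sketch of the same arithmetic in plain Python, so it can be run without a database. It is an illustration only, not part of the answer's query: the function name chunked_groups and the (id, volume_id) tuple layout are hypothetical, and it assumes ids are unique within a volume (the row_number() case).

```python
from collections import defaultdict


def chunked_groups(rows, chunk_size=40000):
    """Mimic the query: rows is an iterable of (id, volume_id) pairs.

    Returns a list of (volume_id, count, min_id, max_id) tuples,
    ordered by volume_id and then by min_id.
    Assumes ids are unique within each volume (hypothetical helper).
    """
    # Collect ids per volume, mirroring "partition by volume_id".
    ids_by_volume = defaultdict(list)
    for row_id, volume_id in rows:
        ids_by_volume[volume_id].append(row_id)

    result = []
    for volume_id in sorted(ids_by_volume):
        ids = sorted(ids_by_volume[volume_id])  # mirrors "order by id"
        buckets = defaultdict(list)
        # seqnum starts at 1, like row_number(); subtracting 1 makes
        # rows 1..chunk_size land in bucket 0, the next chunk_size in bucket 1, ...
        for seqnum, row_id in enumerate(ids, start=1):
            buckets[(seqnum - 1) // chunk_size].append(row_id)
        for bucket in sorted(buckets):
            chunk = buckets[bucket]
            result.append((volume_id, len(chunk), min(chunk), max(chunk)))
    return result
```

Because the sequence number follows id order, every id in one bucket is smaller than every id in the next, which is the max_id < min_id property asked for in the question. A volume with 67330 rows yields one chunk of 40000 and one of 27330.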