postgresql - A "group_id" column in a query that increments once a row value changes?

Time: 2019-09-24 05:39:06

Tags: sql postgresql

I have the following schema and sample data:

with t
as
(
    select 1::int id , 10.01::numeric price_from , 100::int buyers ,100::int sellers ,  10.02::numeric price_to
        union
    select 2::int id , 10.01::numeric price_from , 100::int buyers ,200::int sellers ,  10.02::numeric price_to
        union
    select 3::int id , 10.01::numeric price_from , 500::int buyers ,100::int sellers ,  10.02::numeric price_to
        union
    select 4::int id , 10.01::numeric price_from , 100::int buyers ,100::int sellers ,  10.03::numeric price_to
        union
    select 5::int id , 10.01::numeric price_from , 300::int buyers ,100::int sellers ,  10.03::numeric price_to
        union
    select 6::int id , 10.01::numeric price_from , 100::int buyers ,200::int sellers ,  10.02::numeric price_to
        union
    select 7::int id , 10.01::numeric price_from , 500::int buyers ,100::int sellers ,  10.02::numeric price_to
    order by 1
)
select * 
from t

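I am trying to assign a "group ID" to consecutive rows that share the same price_from and price_to, so that I can run calculations on each interval and on other fields omitted from this example.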

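Clarification:

If a price value appears again in a later row, it should get a new group ID, so I cannot simply aggregate by price.

The price changes are what I care about. There are two price changes, which create three groups (even though the third group has the same prices as the first). Every price change should start a new group ID.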

That is, the desired output (with a group_id column):

with t
as
(
    select 1::int id , 10.01::numeric price_from , 100::int buyers ,100::int sellers ,  10.02::numeric price_to ,1::int group_id
        union
    select 2::int id , 10.01::numeric price_from , 100::int buyers ,200::int sellers ,  10.02::numeric price_to, 1::int group_id
        union
    select 3::int id , 10.01::numeric price_from , 500::int buyers ,100::int sellers ,  10.02::numeric price_to, 1::int group_id
        union
    select 4::int id , 10.01::numeric price_from , 100::int buyers ,100::int sellers ,  10.03::numeric price_to, 2::int group_id
        union
    select 5::int id , 10.01::numeric price_from , 300::int buyers ,100::int sellers ,  10.03::numeric price_to, 2::int group_id
        union
    select 6::int id , 10.01::numeric price_from , 100::int buyers ,200::int sellers ,  10.02::numeric price_to, 3::int group_id
        union
    select 7::int id , 10.01::numeric price_from , 500::int buyers ,100::int sellers ,  10.02::numeric price_to, 3::int group_id
    order by 1
)
select * 
from t

I tried the row_number() and dense_rank() functions, partitioning on the price columns, but still could not get what I want.
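To illustrate why ranking on the prices alone falls short, here is a minimal sketch. It assumes the sample data above trimmed to the id and price columns; the alias price_rank is made up for this example. Ranking looks only at the values, not at where they occur, so a price pair that comes back later gets its old rank again.

```sql
-- Sample rows from the question, reduced to the columns that matter here.
with t (id, price_from, price_to) as (
    values
        (1, 10.01, 10.02),
        (2, 10.01, 10.02),
        (3, 10.01, 10.02),
        (4, 10.01, 10.03),
        (5, 10.01, 10.03),
        (6, 10.01, 10.02),
        (7, 10.01, 10.02)
)
select
    id,
    price_from,
    price_to,
    -- dense_rank numbers the distinct price pairs, ignoring row order by id
    dense_rank() over (order by price_from, price_to) as price_rank
from t
order by id;
-- price_rank by id: 1, 1, 1, 2, 2, 1, 1
-- ids 6 and 7 fall back into rank 1 instead of opening a third group
```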

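I could write a script in Python or another scripting language to "tag" these rows for me, but I would like to know whether there is a SQL way to increment the group ID whenever one of the price values changes.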

Thanks in advance for your help.

1 answer:

Answer 0: (score: 1)

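Maybe this helps: first I compute a pricechange flag, then I sum up that flag with a running total. The window function lag() is used to compare each row with the row before it. If you also need to check price_from, extend the when clause.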

with t
as
(
    select 1::int id , 10.01::numeric price_from , 100::int buyers ,100::int sellers ,  10.02::numeric price_to ,1::int group_id
        union
    select 2::int id , 10.01::numeric price_from , 100::int buyers ,200::int sellers ,  10.02::numeric price_to, 1::int group_id
        union
    select 3::int id , 10.01::numeric price_from , 500::int buyers ,100::int sellers ,  10.02::numeric price_to, 1::int group_id
        union
    select 4::int id , 10.01::numeric price_from , 100::int buyers ,100::int sellers ,  10.03::numeric price_to, 2::int group_id
        union
    select 5::int id , 10.01::numeric price_from , 300::int buyers ,100::int sellers ,  10.03::numeric price_to, 2::int group_id
        union
    select 6::int id , 10.01::numeric price_from , 100::int buyers ,200::int sellers ,  10.02::numeric price_to, 3::int group_id
        union
    select 7::int id , 10.01::numeric price_from , 500::int buyers ,100::int sellers ,  10.02::numeric price_to, 3::int group_id
    order by 1
), 
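t2 as
(
    select
        *,
        -- price_to of the previous row (ordered by id); 0 for the first row
        lag(price_to, 1, 0::numeric) over (order by id) as price_before,
        -- 1 when price_to differs from the previous row, otherwise 0
        case when lag(price_to, 1, 0::numeric) over (order by id) <> price_to
             then 1
             else 0
        end as pricechange
    from t
)
-- The running total of the change flags is the group number.
-- t already carries the expected group_id, so the output shows both side by side.
select
    *,
    sum(pricechange) over (order by id range between unbounded preceding and current row) as group_id
from t2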
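
The answer compares price_to only and mentions extending the when clause for price_from. One possible way to do that is sketched below. It is an illustration rather than the answerer's own code: the CTE name flagged and the column price_change are made up here, and the sample data is restated with a VALUES list so the query runs on its own. IS DISTINCT FROM treats the NULL that lag() returns on the first row as a difference, so the first row opens group 1 without needing a default value.

```sql
-- Sample rows from the question, without the pre-filled group_id.
with t (id, price_from, buyers, sellers, price_to) as (
    values
        (1, 10.01, 100, 100, 10.02),
        (2, 10.01, 100, 200, 10.02),
        (3, 10.01, 500, 100, 10.02),
        (4, 10.01, 100, 100, 10.03),
        (5, 10.01, 300, 100, 10.03),
        (6, 10.01, 100, 200, 10.02),
        (7, 10.01, 500, 100, 10.02)
),
flagged as (
    select
        *,
        -- 1 when either price differs from the previous row (by id), else 0
        case
            when price_from is distinct from lag(price_from) over w
              or price_to   is distinct from lag(price_to)   over w
            then 1
            else 0
        end as price_change
    from t
    window w as (order by id)
)
select
    id, price_from, buyers, sellers, price_to,
    -- running total of the flags; the default frame already ends at the current row
    sum(price_change) over (order by id) as group_id
from flagged
order by id;
-- group_id by id: 1, 1, 1, 2, 2, 3, 3
```

With the group_id in place, the per-interval calculations from the question can be done by wrapping this query and grouping by group_id.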