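I have the sample data below and want to get the desired output (o/p). Please help me with some ideas.
For the 3rd and 4th rows of the output, I want prev_diff_value to be 2015-01-01 00:00:00 instead of 2015-01-02 00:00:00.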
with dat as (
select 1 as id,'20150101 02:02:50'::timestamp as dt union all
select 1,'20150101 03:02:50'::timestamp union all
select 1,'20150101 04:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:51'::timestamp union all
select 1,'20150103 02:02:50'::timestamp union all
select 2,'20150101 02:02:50'::timestamp union all
select 2,'20150101 03:02:50'::timestamp union all
select 2,'20150101 04:02:50'::timestamp union all
select 2,'20150102 02:02:50'::timestamp union all
select 1,'20150104 02:02:50'::timestamp
)-- select * from dat
select id , dt , lag(trunc(dt)) over(partition by id order by dt asc) prev_diff_value
from dat
order by id,dt desc
O/P :
id dt prev_diff_value
1 2015-01-04 02:02:50 2015-01-03 00:00:00
1 2015-01-03 02:02:50 2015-01-02 00:00:00
1 2015-01-02 02:02:51 2015-01-02 00:00:00
1 2015-01-02 02:02:50 2015-01-02 00:00:00
1 2015-01-02 02:02:50 2015-01-01 00:00:00
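Answer 0 (score: 2)
As far as I understand, you want to get the previous distinct date for each timestamp within the id partition. I would apply lag to the unique combinations of id and date, and then join the result back to the original dataset, like this: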
with dat as (
select 1 as id,'20150101 02:02:50'::timestamp as dt union all
select 1,'20150101 03:02:50'::timestamp union all
select 1,'20150101 04:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:51'::timestamp union all
select 1,'20150103 02:02:50'::timestamp union all
select 2,'20150101 02:02:50'::timestamp union all
select 2,'20150101 03:02:50'::timestamp union all
select 2,'20150101 04:02:50'::timestamp union all
select 2,'20150102 02:02:50'::timestamp union all
select 1,'20150104 02:02:50'::timestamp
)
,dat_unique_lag as (
select *, lag(date) over(partition by id order by date asc) prev_diff_value
from (
select distinct id,trunc(dt) as date
from dat
)
)
select *
from dat
join dat_unique_lag
using (id)
where trunc(dat.dt)=dat_unique_lag.date
order by id,dt desc;
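To check the logic of that query outside the database, here is a minimal Python sketch of the same idea: collect the distinct days per id, look up each day's predecessor, then attach it to every row. It is only an illustration, not part of the original answer; the names `rows`, `prev_distinct_dates`, `lookup` and `result` are made up for this example, and the data is copied from the `dat` CTE.

```python
from datetime import datetime

# Sample rows copied from the question's "dat" CTE: (id, dt)
rows = [
    (1, datetime(2015, 1, 1, 2, 2, 50)),
    (1, datetime(2015, 1, 1, 3, 2, 50)),
    (1, datetime(2015, 1, 1, 4, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 51)),
    (1, datetime(2015, 1, 3, 2, 2, 50)),
    (2, datetime(2015, 1, 1, 2, 2, 50)),
    (2, datetime(2015, 1, 1, 3, 2, 50)),
    (2, datetime(2015, 1, 1, 4, 2, 50)),
    (2, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 4, 2, 2, 50)),
]


def prev_distinct_dates(rows):
    """Map (id, day) -> previous distinct day for that id (None for the first day)."""
    days_by_id = {}
    for row_id, dt in rows:
        days_by_id.setdefault(row_id, set()).add(dt.date())

    prev = {}
    for row_id, days in days_by_id.items():
        ordered = sorted(days)
        for i, day in enumerate(ordered):
            # Equivalent of lag(date) over (partition by id order by date)
            prev[(row_id, day)] = ordered[i - 1] if i > 0 else None
    return prev


# "Join" every original row to the lookup on (id, truncated day)
lookup = prev_distinct_dates(rows)
result = [(row_id, dt, lookup[(row_id, dt.date())]) for row_id, dt in rows]

for row in sorted(result, key=lambda r: (r[0], r[1])):
    print(row)
```

Because the lag is computed over distinct days rather than over raw rows, several timestamps on the same day all receive the same previous day.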
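However, the join-based query above is not super efficient. If your data only ever has a limited number of timestamps on the same day, you can instead extend the lag with nested conditional statements, like this: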
with dat as (
select 1 as id,'20150101 02:02:50'::timestamp as dt union all
select 1,'20150101 03:02:50'::timestamp union all
select 1,'20150101 04:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:51'::timestamp union all
select 1,'20150103 02:02:50'::timestamp union all
select 2,'20150101 02:02:50'::timestamp union all
select 2,'20150101 03:02:50'::timestamp union all
select 2,'20150101 04:02:50'::timestamp union all
select 2,'20150102 02:02:50'::timestamp union all
select 1,'20150104 02:02:50'::timestamp
)
select id, dt,
case
when lag(trunc(dt)) over(partition by id order by dt asc)=trunc(dt)
then case
when lag(trunc(dt),2) over(partition by id order by dt asc)=trunc(dt)
then case
when lag(trunc(dt),3) over(partition by id order by dt asc)=trunc(dt)
then lag(trunc(dt),4) over(partition by id order by dt asc)
else lag(trunc(dt),3) over(partition by id order by dt asc)
end
else lag(trunc(dt),2) over(partition by id order by dt asc)
end
else lag(trunc(dt)) over(partition by id order by dt asc)
end as prev_diff_value
from dat
order by id,dt desc;
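Basically, you look at the previous record, and if it does not fit you look at the record before that one, and so on, until you find the right record or run out of depth in your statement. Here it looks back as far as the 4th preceding record.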
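The look-back described above can be sketched in Python to make the depth limit visible. This is an illustration under assumptions, not the answer's code: `prev_diff_bounded` and `max_depth` are hypothetical names, and the loop mirrors the nested case expression by returning whatever sits at the deepest lag once the depth is exhausted, even if it is still the same day.

```python
from datetime import datetime

# Sample rows copied from the question's "dat" CTE: (id, dt)
rows = [
    (1, datetime(2015, 1, 1, 2, 2, 50)),
    (1, datetime(2015, 1, 1, 3, 2, 50)),
    (1, datetime(2015, 1, 1, 4, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 51)),
    (1, datetime(2015, 1, 3, 2, 2, 50)),
    (2, datetime(2015, 1, 1, 2, 2, 50)),
    (2, datetime(2015, 1, 1, 3, 2, 50)),
    (2, datetime(2015, 1, 1, 4, 2, 50)),
    (2, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 4, 2, 2, 50)),
]


def prev_diff_bounded(rows, max_depth=4):
    """Step back at most max_depth rows per id and return (id, dt, prev_day)."""
    by_id = {}
    for row_id, dt in rows:
        by_id.setdefault(row_id, []).append(dt)

    out = []
    for row_id, dts in by_id.items():
        dts.sort()  # partition by id order by dt asc
        for i, dt in enumerate(dts):
            prev = None
            for k in range(1, max_depth + 1):
                if i - k < 0:
                    prev = None  # lag ran off the start of the partition
                    break
                prev = dts[i - k].date()  # lag(trunc(dt), k)
                if prev != dt.date():
                    break  # found a different day
            # If the loop ends without a break, prev is the deepest lag,
            # which may still be the same day (depth exhausted).
            out.append((row_id, dt, prev))
    return out


for row in prev_diff_bounded(rows):
    print(row)
```

Calling `prev_diff_bounded(rows, max_depth=1)` reproduces the plain `lag` from the question, where a row can get its own day back. A depth of 4 is enough for this sample because no id has more than three rows on one day.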
Answer 1 (score: 1)
This is a different way of looking at the problem. It is not very efficient, but it was fun to work out.
with dat as (
select 1 as id,'20150101 02:02:50'::timestamp as dt union all
select 1,'20150101 03:02:50'::timestamp union all
select 1,'20150101 04:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:50'::timestamp union all
select 1,'20150102 02:02:51'::timestamp union all
select 1,'20150103 02:02:50'::timestamp union all
select 2,'20150101 02:02:50'::timestamp union all
select 2,'20150101 03:02:50'::timestamp union all
select 2,'20150101 04:02:50'::timestamp union all
select 2,'20150102 02:02:50'::timestamp union all
select 1,'20150104 02:02:50'::timestamp
)
select distinct
dat.id
,dat.dt
,last_value(dat2.d) over (partition by dat.id, dat.dt order by dat2.d asc rows between unbounded preceding and unbounded following) as prev_diff_value
from dat
left join (
select distinct
id
,trunc(dt) as d
from dat) dat2 on dat.id = dat2.id and trunc(dat.dt) > dat2.d
order by 1,2,3;
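This maps out the distinct id and date pairs and joins them back onto the dataset only where the joined date is earlier than the date of the row in question. The last_value function then takes the last (latest) of those dates for each row, and distinct removes all the redundant rows from the output. I know this question is a few years old, but I stumbled across it and had fun with it.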
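As a rough cross-check of that reasoning, the same steps can be written in Python: build the distinct (id, day) pairs, keep only days strictly earlier than each row's day, take the latest of them, and de-duplicate. This is a sketch under assumptions rather than a translation of the query; `prev_diff_by_join` is a hypothetical name and the data is copied from the `dat` CTE.

```python
from datetime import datetime

# Sample rows copied from the question's "dat" CTE: (id, dt)
rows = [
    (1, datetime(2015, 1, 1, 2, 2, 50)),
    (1, datetime(2015, 1, 1, 3, 2, 50)),
    (1, datetime(2015, 1, 1, 4, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 2, 2, 2, 51)),
    (1, datetime(2015, 1, 3, 2, 2, 50)),
    (2, datetime(2015, 1, 1, 2, 2, 50)),
    (2, datetime(2015, 1, 1, 3, 2, 50)),
    (2, datetime(2015, 1, 1, 4, 2, 50)),
    (2, datetime(2015, 1, 2, 2, 2, 50)),
    (1, datetime(2015, 1, 4, 2, 2, 50)),
]


def prev_diff_by_join(rows):
    """Return distinct (id, dt, prev_day) using a join on strictly earlier days."""
    # Equivalent of: select distinct id, trunc(dt) as d from dat
    distinct_days = {(row_id, dt.date()) for row_id, dt in rows}

    result = set()  # a set plays the role of "select distinct"
    for row_id, dt in rows:
        # Join condition: same id and trunc(dat.dt) > dat2.d
        earlier = [d for other_id, d in distinct_days
                   if other_id == row_id and d < dt.date()]
        # last_value over days ordered ascending is the latest earlier day;
        # the left join leaves None when no earlier day exists.
        result.add((row_id, dt, max(earlier) if earlier else None))
    return sorted(result, key=lambda r: (r[0], r[1]))


for row in prev_diff_by_join(rows):
    print(row)
```

Unlike the nested-lag approach, this has no depth limit, at the cost of comparing every row against all distinct days for its id. Note that the duplicated timestamp for id 1 collapses to a single output row, as it does with `select distinct`.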