Find the record closest to a given date in each group - SQL

Time: 2019-05-20 19:06:21

Tags: sql amazon-redshift

I'm new to SQL. Suppose we have a table like this:

+-------+----------+-----------+
|userid | statusid |   date    |
+-------+----------+-----------+
| 1     |  1       | 2018-10-10| 
| 1     |  2       | 2018-10-12|
| 2     |  1       | 2018-09-25|
| 2     |  1       | 2018-10-01|
+-------+----------+-----------+

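For each userid, I need to get the statusid of the row whose date is as close as possible to a given date. Say my given date is 2018-10-01. How can I do this? I tried various combinations of GROUP BY and PARTITION BY, but nothing worked. Can anyone help?

Edit: my database is Amazon Redshift.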

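2 Answers:

Answer 0 (score: 2)

You can use the row_number() window function, partitioned by userid and ordered by the absolute value of the difference between each row's date and the given date. The row numbered 1 in each partition is the closest one.

(Note that row_number() is not available in MySQL before version 8.0, so the MySQL solution does not use it; it still orders by abs() of the date difference.)

I don't know which DBMS you are using.

This solution is for Oracle: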

with tab(userid, statusid, "date") as
(
 select 1,1,date'2018-10-10' from dual union all
 select 1,2,date'2018-10-12' from dual union all
 select 2,1,date'2018-09-25' from dual union all
 select 2,1,date'2018-10-02' from dual
)
select tt.userid, tt.statusid, tt."date"
  from
(
select t.userid, t.statusid , t."date",
       row_number() over (partition by t.userid 
                          order by abs("date" - date'2018-10-01')) as rn
  from tab t
) tt
where tt.rn = 1

Demo for Oracle

This solution is for SQL Server:

with tab([userid], [statusid], [date]) as
(
 select 1,1,'2018-10-10' union all
 select 1,2,'2018-10-12' union all
 select 2,1,'2018-09-25' union all
 select 2,1,'2018-10-02' 
)
select tt.[userid], tt.[statusid], tt.[date]
  from
(
select t.[userid], t.[statusid] , t.[date], 
       row_number() over (partition by t.[userid] 
                          order by abs(datediff(day,[date],'2018-10-01'))) as rn
  from tab t
) tt
where tt.rn = 1

Demo for SQL Server

This solution is for MySQL:

select tt.userid, tt.statusid, tt.date
  from
  (
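   -- datediff() returns the difference in days; plain "date - date" in MySQL
   -- subtracts the numeric forms (e.g. 20181001 - 20180925) and is not a day count
   select t.userid, t.statusid , t.date,
          @rn := if(@iter = t.userid, @rn + 1, 1) as rn,
          @iter := t.userid, 
          abs(datediff(t.date, '2018-10-01')) as df
     from tab t
     join (select @iter := 0, @rn := 0) as q_iter
    order by t.userid, abs(datediff(t.date, '2018-10-01'))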
  ) tt
where tt.rn = 1

Demo for MySQL

This solution is for PostgreSQL:

with tab(userid, statusid, date) as
(
 select 1,1,'2018-10-10' union all
 select 1,2,'2018-10-12' union all
 select 2,1,'2018-09-25' union all
 select 2,1,'2018-10-02' 
)
select tt.userid, tt.statusid, tt.date
  from
(
select t.userid, t.statusid , t.date, 
       row_number() over (partition by t.userid
                          order by abs(date::date-'2018-10-01'::date)) as rn
  from tab t
) tt
where tt.rn = 1

Demo for PostgreSQL
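
Since the question was later edited to say the database is Amazon Redshift, here is a minimal sketch of the same row_number() approach for that engine. It is an untested adaptation, not part of the original answer: the table name tab is assumed as in the examples above, the column is quoted as "date" to avoid any clash with the type name, and the second sort key is an added assumption that makes ties (two dates equally far from the target) resolve to the earlier date instead of an arbitrary row.

```sql
-- Sketch for Amazon Redshift; table name "tab" is assumed, as in the examples above.
select tt.userid, tt.statusid, tt."date"
  from
(
select t.userid, t.statusid, t."date",
       row_number() over (partition by t.userid
                          -- distance in days from the given date
                          order by abs(datediff(day, t."date", '2018-10-01'::date)),
                                   -- assumed tie-break: prefer the earlier date
                                   t."date") as rn
  from tab t
) tt
where tt.rn = 1;
```

With the table from the question, the closest row for userid 1 is 2018-10-10 (statusid 1) and for userid 2 it is 2018-10-01 (statusid 1).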

Answer 1 (score: 0)

Typically, for this type of problem, you want the latest date on or before the given date.

If so:

select t.*
from t
where t.date = (select max(t2.date)
                from t t2
                where t2.userid = t.userid and t2.date <= '2018-10-01'
               );
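
Two properties of the query above are worth keeping in mind. A user with no row on or before the given date is dropped entirely (userid 1 in the question's table, whose dates are both after 2018-10-01), and a user with several rows sharing the same latest date gets all of them back. If exactly one row per user is wanted, the same "on or before" rule can be sketched with row_number(). This is an illustrative variant in Redshift/PostgreSQL-style syntax, not part of the original answer; the statusid tie-break is an assumption.

```sql
-- Sketch: latest row on or before the given date, at most one row per user.
-- Table name "t" follows the answer above; quoting "date" is a precaution.
select x.userid, x.statusid, x."date"
  from
(
select t.userid, t.statusid, t."date",
       row_number() over (partition by t.userid
                          order by t."date" desc,   -- latest qualifying date first
                                   t.statusid) as rn -- assumed tie-break
  from t
 where t."date" <= '2018-10-01'
) x
where x.rn = 1;
```

Users with no qualifying row are still omitted; that follows from the "on or before" rule itself rather than from how the query is written.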