Find the run in which a column value is first not seen

Time: 2017-04-27 19:29:33

Tags: sql postgresql amazon-redshift

I have to write a SQL query that finds the RunID in which an ErrorID is first not seen. Let me explain with an example.

For example:

RunID | RunDate    | ErrorID
----- | ---------- | ---------
101   | 04/11/2017 | 1
101   | 04/11/2017 | 2
101   | 04/11/2017 | 3
102   | 04/22/2017 | 2
102   | 04/22/2017 | 3
103   | 04/26/2017 | 1
104   | 04/27/2017 | 3
105   | 04/28/2017 | 4

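In the example above, RunID 101 has errors 1, 2, and 3, and RunID 102 has errors 2 and 3, so ErrorID 1 is not found in the second run. At that point, the RunID in which ErrorID 1 is first not seen would be 102. However, ErrorID 1 is found again in RunID 103 and is then not found in RunID 104. The query should therefore return 104 as the RunID in which ErrorID 1 is first not seen.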
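To reproduce the example, the data can be loaded roughly as follows. This is a minimal sketch assuming PostgreSQL; the table name so80 is borrowed from the first answer, and the column types are assumptions because the post does not state them. Dates are written as ISO literals so they do not depend on the session's DateStyle. The last query lists the final run in which each error appears, which is the starting point for finding the run where it is first not seen.

```sql
-- Sample data from the question. Column types are assumed, not given in the post.
create temporary table so80 (
    runid   integer not null,
    rundate date    not null,
    errorid integer not null
);

insert into so80 (runid, rundate, errorid) values
    (101, date '2017-04-11', 1),
    (101, date '2017-04-11', 2),
    (101, date '2017-04-11', 3),
    (102, date '2017-04-22', 2),
    (102, date '2017-04-22', 3),
    (103, date '2017-04-26', 1),
    (104, date '2017-04-27', 3),
    (105, date '2017-04-28', 4);

-- Last run in which each error appears; the run wanted is the next one after it.
select errorid, max(runid) as last_seen_run
from so80
group by errorid
order by errorid;
-- errorid 1 -> 103, 2 -> 102, 3 -> 104, 4 -> 105
```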

I tried window functions such as lead and lag, but they did not help.

Here are the expected results:

Date on which ErrorID 2 is first not seen:

RunID | RunDate    | ErrorID
----- | ---------- | ---------
103   | 04/26/2017 | 2

Because ErrorID 2 is never seen after RunID 102 (RunID 103 is the first instance where it is not seen).

Date on which ErrorID 1 is first not seen:

RunID | RunDate    | ErrorID
----- | ---------- | ---------
104   | 04/27/2017 | 1

ErrorID 1 is never seen from RunID 104 onward.

Date on which ErrorID 3 is first not seen:

RunID | RunDate    | ErrorID
----- | ---------- | ---------
105   | 04/28/2017 | 3

ErrorID 3 is never seen from RunID 105 onward.

2 Answers:

Answer 0: (score: 1)

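so=> with l as (
  with m as (
    -- last run in which each error was seen
    select distinct max(runid) over(partition by errorid),errorid
    from so80
  )
  , a as (
    select distinct runid,errorid
    from so80
  )
  -- earliest run after that last sighting
  select distinct min(a.runid) over (partition by m.errorid),m.errorid
  from m
  join a on m.max < a.runid
)
select s.*
from so80 s
-- note: s.errorid=s.errorid is always true, so each row carries the errorid present in that run, not the one that went missing
join l on l.min=s.runid and s.errorid=s.errorid
;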
 runid |   rundate    | errorid
-------+--------------+---------
   104 |  04/27/2017  |       3
   103 |  04/26/2017  |       1
   105 |  04/28/2017  |       4
(3 rows)
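The query above joins on s.errorid=s.errorid, which is always true, so its result shows the errors present in each returned run (3, 1, 4) rather than the errors that went missing there (1, 2, 3). A minimal sketch of a variant that reports the missing ErrorID is given below. It assumes PostgreSQL and, like the query above, that RunID increases over time. The sample rows are inlined as a CTE named so80 so the block runs on its own, and the CTE names last_seen, runs, and first_missing are illustrative only.

```sql
with so80 (runid, rundate, errorid) as (
    values
        (101, date '2017-04-11', 1),
        (101, date '2017-04-11', 2),
        (101, date '2017-04-11', 3),
        (102, date '2017-04-22', 2),
        (102, date '2017-04-22', 3),
        (103, date '2017-04-26', 1),
        (104, date '2017-04-27', 3),
        (105, date '2017-04-28', 4)
),
-- last run in which each error was seen
last_seen as (
    select errorid, max(runid) as last_runid
    from so80
    group by errorid
),
-- one row per run
runs as (
    select distinct runid, rundate
    from so80
),
-- earliest run after the last sighting of each error
first_missing as (
    select l.errorid, min(r.runid) as runid
    from last_seen l
    join runs r on r.runid > l.last_runid
    group by l.errorid
)
select f.runid, r.rundate, f.errorid
from first_missing f
join runs r on r.runid = f.runid
order by f.errorid;
-- rows: (104, 2017-04-27, 1), (103, 2017-04-26, 2), (105, 2017-04-28, 3)
```

ErrorID 4 does not appear in the result because no run follows its last sighting in RunID 105.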

Answer 1: (score: 1)

--Get the last runDate when an errorID was seen
with t1 as (select runId,runDate,errorID
            ,first_value(runDate) over(partition by errorID order by runDate desc 
                                       rows between unbounded preceding and unbounded following) as last_seen
            from tablename
           )
--Get the next runDate based on the previous result
,t2 as (select runid,errorID,runDate
        ,(select min(runDate) from t1 t11 where t11.runDate>t1.last_seen) as date_first_not_seen
        from t1
       )
--Join it to the original table to get the runID information from that runDate in the previous result (t2)
select distinct t.runid,t2.errorid,t.rundate
from t2
join tablename t on t.rundate=t2.date_first_not_seen;

--Alternative: join every later run to each error and keep the earliest one
with t1 as (select runId,runDate,errorID
            ,first_value(runDate) over(partition by errorID order by runDate desc
                                       rows between unbounded preceding and unbounded following) as last_seen
            from tablename)
select distinct
 t1.errorid
--Order by t.runDate so that first_value picks the earliest run after last_seen
,first_value(t.runDate) over(partition by t1.errorID order by t.runDate rows between unbounded preceding and unbounded following) as rundate
,first_value(t.runID) over(partition by t1.errorID order by t.runDate rows between unbounded preceding and unbounded following) as runid
from t1
join tablename t on t.runDate>t1.last_seen;

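Since the question mentions lead, one further variation is to compute "the next run" with lead() over the distinct list of runs and then look up the run in which each error was last seen. The following is only a sketch under assumptions: PostgreSQL syntax, exactly one run per runDate, and the sample rows inlined as a CTE named tablename so the block runs on its own. The names runs and last_seen are illustrative.

```sql
with tablename (runid, rundate, errorid) as (
    values
        (101, date '2017-04-11', 1),
        (101, date '2017-04-11', 2),
        (101, date '2017-04-11', 3),
        (102, date '2017-04-22', 2),
        (102, date '2017-04-22', 3),
        (103, date '2017-04-26', 1),
        (104, date '2017-04-27', 3),
        (105, date '2017-04-28', 4)
),
-- one row per run, together with the run that follows it in date order
runs as (
    select runid, rundate,
           lead(runid)   over (order by rundate) as next_runid,
           lead(rundate) over (order by rundate) as next_rundate
    from (select distinct runid, rundate from tablename) d
),
-- last date on which each error was seen
last_seen as (
    select errorid, max(rundate) as last_rundate
    from tablename
    group by errorid
)
select r.next_runid as runid, r.next_rundate as rundate, l.errorid
from last_seen l
join runs r on r.rundate = l.last_rundate
where r.next_runid is not null   -- drop errors seen in the most recent run
order by l.errorid;
-- rows: (104, 2017-04-27, 1), (103, 2017-04-26, 2), (105, 2017-04-28, 3)
```

Applying lead() to the deduplicated run list rather than to the raw rows matters here, because the raw table has several rows per run and lead() would otherwise return another row of the same run.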