SQL窗口函数

时间:2018-11-01 19:29:59

标签: sql oracle window-functions

我有一个看起来像这样的数据框:

ID   DATE
ABC  2018-02-07
ABC  2018-02-10
ABC  2018-02-13
ABC  2018-02-22
ABC  2018-02-26
ABC  2018-02-28
ABC  2018-04-06
ABC  2018-04-06
ABC  2018-04-12

我试图添加3列:(1)所有记录的最早日期(2)日期和最早日期之间的时间(3)返回该记录的第n个出现次数#,返回重复日期的第n个记录的最大值。我期望将以下内容作为输出:

PEL_ID TRANSACTIONDATEDIFF EARLIESTEXPOSURE    TIMEDIFF      NTH_FREQUENCY
ABC  2018-02-07     2018-02-07            0             1
ABC  2018-02-10     2018-02-07            3             2 
ABC  2018-02-13     2018-02-07            6             3
ABC  2018-02-22     2018-02-07           15             4
ABC  2018-02-26     2018-02-07           19             5 
ABC  2018-02-28     2018-02-07           21             6
ABC  2018-04-06     2018-02-07           58             8
ABC  2018-04-12     2018-02-07           64             9

这是我的SQL代码:

SELECT 
PEL_ID,TRANSACTIONDATEDIFF,EARLIESTEXPOSURE,TIME_DIFF,MAX(NTH_FREQUENCY) 
FROM (
SELECT C.*,ROW_NUMBER() OVER(PARTITION BY PEL_ID ORDER BY PEL_ID) AS 
NTH_FREQUENCY FROM
(SELECT A.PEL_ID,A.TRANSACTIONDATEDIFF,B.EARLIESTEXPOSURE, 
(A.TRANSACTIONDATEDIFF-B.EARLIESTEXPOSURE) AS TIME_DIFF FROM
CAMP_31323_TODATE A JOIN (SELECT PEL_ID,MIN(TRANSACTIONDATEDIFF) AS 
EARLIESTEXPOSURE FROM CAMP_31323_TODATE
GROUP BY PEL_ID) B ON A.PEL_ID=B.PEL_ID
ORDER BY A.PEL_ID) C
  )
GROUP BY PEL_ID,TRANSACTIONDATEDIFF,EARLIESTEXPOSURE,TIME_DIFF
ORDER BY PEL_ID,TRANSACTIONDATEDIFF ASC;

除了nth_frequency之外,大多数代码都在工作,这是输出:

PEL_ID TRANSACTIONDATEDIFF EARLIESTEXPOSURE    TIMEDIFF      NTH_FREQUENCY
ABC  2018-02-07     2018-02-07            0             3
ABC  2018-02-10     2018-02-07            3             6 
ABC  2018-02-13     2018-02-07            6             8
ABC  2018-02-22     2018-02-07           15             2
ABC  2018-02-26     2018-02-07           19             7 
ABC  2018-02-28     2018-02-07           21             1
ABC  2018-04-06     2018-02-07           58             5
ABC  2018-04-12     2018-02-07           64             9

我不确定为什么会这样。任何帮助将由衷的感谢。预先感谢。

1 个答案:

答案 0 :(得分:1)

不是完整的解决方案,而是一个起点:

with t as (
  select 'ABC' AS ID, DATE '2018-02-07' as D from dual union all
  select 'ABC' AS ID, DATE '2018-02-10' as D from dual union all
  select 'ABC' AS ID, DATE '2018-02-13' as D from dual union all
  select 'ABC' AS ID, DATE '2018-02-22' as D from dual union all
  select 'ABC' AS ID, DATE '2018-02-26' as D from dual union all
  select 'ABC' AS ID, DATE '2018-02-28' as D from dual union all
  select 'ABC' AS ID, DATE '2018-04-06' as D from dual union all
  select 'ABC' AS ID, DATE '2018-04-06' as D from dual union all
  select 'ABC' AS ID, DATE '2018-04-12' as D from dual),
g as 
   (select  
      ID,
      D as TRANSACTIONDATEDIFF,
      MIN(D) OVER (PARTITION BY ID ORDER BY D) as EARLIESTEXPOSURE,
      D - MIN(D) OVER (PARTITION BY ID ORDER BY D) as TIMEDIFF,
      RANK() OVER (PARTITION BY ID ORDER BY D) AS NTH_FREQUENCY
   from t)
select distinct *
from g
order by 2;


+----------------------------------------------------------------+
|ID  |TRANSACTIONDATEDIFF|EARLIESTEXPOSURE|TIMEDIFF|NTH_FREQUENCY|
+----------------------------------------------------------------+
|ABC |07-FEB-18          |07-FEB-18       |0       |1            |
|ABC |10-FEB-18          |07-FEB-18       |3       |2            |
|ABC |13-FEB-18          |07-FEB-18       |6       |3            |
|ABC |22-FEB-18          |07-FEB-18       |15      |4            |
|ABC |26-FEB-18          |07-FEB-18       |19      |5            |
|ABC |28-FEB-18          |07-FEB-18       |21      |6            |
|ABC |06-APR-18          |07-FEB-18       |58      |7            |
|ABC |12-APR-18          |07-FEB-18       |64      |9            |
+----------------------------------------------------------------+