Teradata: QUALIFY with RANK not giving the expected result

Time: 2014-08-02 08:24:23

Tags: sql teradata

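I have a problem: I want the two most recent values of a column (REVISION), based on max(last_updated_date).

Scenario:

Each part (hardware part) has a revision column. Whenever anything about a part changes, the revision changes and so does the last_updated date. A revision can be anything: digits, letters, - _ and so on.

Say I have 100 parts; 60 of them have never changed and 40 have. Each of those 40 will have at least two latest revisions, so the output should contain 60 + 40 * 2 = 140 records in total.

Without QUALIFY I get more than 5 million distinct parts, so I should get at least 5M records back (that would be the case where no part has had any revision).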

SELECT DISTINCT FROM_NAME
FROM(
select
CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME  ,
CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) as LAST_UPDATE_DATE
--RANK( )  OVER ( ORDER BY max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A' 
--QUALIFY RANK1<=2 
group by 1,2,3
) TM

Sample row: 14728.20721.304.13308 (from_id)  R-0331128 (from_name)  - (revision)  8/7/2013 20:30:02 (last_updated date)

But with the rank and QUALIFY RANK1 <= 2 I get only 186 rows.

select
CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME  ,
CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) as LAST_UPDATE_DATE
,RANK( )  OVER ( ORDER BY max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A' 
QUALIFY RANK1<=2
group by 1,2,3

In a nutshell, from the query below I want the top two values for revision 0024, corresponding to the two latest update dates.

select FROM_NAME,FROM_REVISION, LAST_UPDATE_DATE
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST


0024    301345498360631 1/24/2014 11:22:17
0024    431365606243002 12/16/2013 20:16:44
0024    491333037555534 6/6/2013 18:08:51

1 answer:

Answer 0 (score: 1)

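You need to add PARTITION BY; otherwise the rank is computed across all parts together. In your case 186 rows share the same maximum date, so all of them get rank 1 and the next rank is 187. That is why QUALIFY RANK1 <= 2 returns exactly 186 rows.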
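To make the effect of ties visible, here is a minimal sketch on a throwaway volatile table. The table name part_hist and its five rows are made up for illustration and are not part of the original schema; the two "A" rows deliberately share the same timestamp.

```sql
-- Hypothetical scratch table; names and data are assumptions for this demo only.
CREATE VOLATILE TABLE part_hist (
    from_name        VARCHAR(20),
    from_revision    VARCHAR(20),
    last_update_date TIMESTAMP(0)
) ON COMMIT PRESERVE ROWS;

INSERT INTO part_hist VALUES ('P1', 'A', TIMESTAMP '2014-01-24 11:22:17');
INSERT INTO part_hist VALUES ('P1', 'B', TIMESTAMP '2013-12-16 20:16:44');
INSERT INTO part_hist VALUES ('P1', 'C', TIMESTAMP '2013-06-06 18:08:51');
INSERT INTO part_hist VALUES ('P2', 'A', TIMESTAMP '2014-01-24 11:22:17');
INSERT INTO part_hist VALUES ('P2', 'B', TIMESTAMP '2013-08-07 20:30:02');

SELECT from_name,
       from_revision,
       last_update_date,
       -- One ranking over the whole table: the two tied newest rows both get 1,
       -- and the next row gets 3 (RANK leaves a gap after ties).
       RANK() OVER (ORDER BY last_update_date DESC) AS rank_all,
       -- Ranking restarts at 1 for every part.
       RANK() OVER (PARTITION BY from_name
                    ORDER BY last_update_date DESC) AS rank_per_part
FROM part_hist;
```

With this data, filtering on rank_all <= 2 would keep only the two tied rows (P1/A and P2/A), while filtering on rank_per_part <= 2 would keep two rows for each part, four in total.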

select
   CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
   CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME  ,
   CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
   max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) as LAST_UPDATE_DATE
  ,RANK( )
   OVER (PARTITION BY FROM_NAME
         ORDER BY max(CDR_ODS_R_GE_OBJ_HST.LAST_UPDATE_DATE) DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A' 
group by 1,2,3
QUALIFY RANK1<=2
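Still, are you sure you need the GROUP BY? Maybe you are confusing this with the old, deprecated RANK syntax, which used GROUP BY instead of PARTITION BY. The following may return the same result more efficiently: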

select
   CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
   CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME  ,
   CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
   CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE as LAST_UPDATE_DATE
  ,RANK( )
   OVER (PARTITION BY FROM_NAME
         ORDER BY CDR_ODS_R_GE_OBJ_HST.LAST_UPDATE_DATE DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A' 
QUALIFY RANK1<=2
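
One caveat: RANK gives tied rows the same rank, so a part whose newest rows share a timestamp could come back with more than two rows. If at most two rows per part are required, ROW_NUMBER is a possible alternative. The sketch below assumes that breaking ties by FROM_REVISION is acceptable; that tie-breaker is an assumption, not something stated in the question.

```sql
-- Sketch only: the FROM_REVISION tie-breaker is an assumption.
SELECT FROM_ID,
       FROM_NAME,
       FROM_REVISION,
       LAST_UPDATE_DATE
FROM GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST
-- ROW_NUMBER numbers rows 1, 2, 3, ... within each part without repeating a value,
-- so this keeps at most two rows per FROM_NAME even when dates tie.
QUALIFY ROW_NUMBER() OVER (PARTITION BY FROM_NAME
                           ORDER BY LAST_UPDATE_DATE DESC,
                                    FROM_REVISION DESC) <= 2;
```

The window function can be written directly in the QUALIFY clause, so no RANK1 column needs to appear in the result.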