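I am stuck on a problem: for each part, I want the two most recent values of a column (REVISION), chosen by max(last_updated_date).
Scenario:
Every part (a hardware part) has a revision column. Whenever anything about a part changes, its revision changes and its last_updated date changes too. A revision can be anything: digits, letters, -, _, and so on.
Say I have 100 parts, 60 of them unchanged and 40 changed. Each of the 40 then has at least two revisions, so the output should contain 60 + 40 * 2 = 140 rows in total.
Without QUALIFY, I get more than 5 million distinct parts. So I should get at least 5M records (that being the case where no part was ever revised).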
SELECT DISTINCT FROM_NAME
FROM(
select
CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME ,
CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) as LAST_UPDATE_DATE
--RANK( ) OVER ( ORDER BY max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A'
--QUALIFY RANK1<=2
group by 1,2,3
) TM
Sample row: 14728.20721.304.13308 (from_id), R-0331128 (from_name), - (revision), 8/7/2013 20:30:02 (last_updated date)
But with the rank and QUALIFY <= 2, I get only 186 parts.
select
CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME ,
CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) as LAST_UPDATE_DATE
,RANK( ) OVER ( ORDER BY max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A'
QUALIFY RANK1<=2
group by 1,2,3
In a nutshell, from the query below I want, for 0024, the top two revision values, i.e. the ones with the two latest update dates.
select FROM_NAME,FROM_REVISION, LAST_UPDATE_DATE
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
0024 301345498360631 1/24/2014 11:22:17
0024 431365606243002 12/16/2013 20:16:44
0024 491333037555534 6/6/2013 18:08:51
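Answer 0 (score: 1)
You need to add PARTITION BY; otherwise the rank is computed across all parts. In your case, 186 parts share the same maximum date, so they all get rank 1, and the next rank is 187.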
select
CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME ,
CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
max(CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE) as LAST_UPDATE_DATE
,RANK( )
OVER (PARTITION BY FROM_NAME
ORDER BY max(CDR_ODS_R_GE_OBJ_HST.LAST_UPDATE_DATE) DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A'
group by 1,2,3
QUALIFY RANK1<=2
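To see the tie effect in isolation, here is a minimal sketch on toy data. It is not the original Teradata setup: it assumes Python's built-in sqlite3 (SQLite 3.25 or later for window functions), a hypothetical table obj_hst with three text columns, and a derived table with WHERE in place of QUALIFY, which SQLite does not have.

```python
import sqlite3

# Toy data (hypothetical): three parts share the newest timestamp,
# mimicking the 186-way tie described above.
rows = [
    ("A", "r1", "2014-01-24 11:22:17"),
    ("A", "r2", "2013-12-16 20:16:44"),
    ("A", "r3", "2013-06-06 18:08:51"),
    ("B", "r1", "2014-01-24 11:22:17"),
    ("B", "r2", "2013-08-07 20:30:02"),
    ("C", "r1", "2014-01-24 11:22:17"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obj_hst (from_name TEXT, from_revision TEXT, last_update_date TEXT)"
)
conn.executemany("INSERT INTO obj_hst VALUES (?, ?, ?)", rows)


def top_two(over_clause):
    """Return rows whose RANK() over the given window is 1 or 2."""
    sql = f"""
        SELECT from_name, from_revision, rnk
        FROM (
            SELECT from_name, from_revision,
                   RANK() OVER ({over_clause}) AS rnk
            FROM obj_hst
        ) AS t
        WHERE rnk <= 2          -- stands in for QUALIFY rnk <= 2
        ORDER BY from_name, rnk
    """
    return conn.execute(sql).fetchall()


# One ranking across the whole table: the three tied rows all get rank 1,
# the next rank is 4, so nothing else passes the filter.
global_rank = top_two("ORDER BY last_update_date DESC")

# One ranking per part: each part keeps up to two of its newest revisions.
per_part = top_two("PARTITION BY from_name ORDER BY last_update_date DESC")

print(global_rank)
print(per_part)
```

With the unpartitioned window only the three tied rows come back, one per part; with PARTITION BY, parts A and B each return two rows and part C returns its single row.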
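Are you sure you need the GROUP BY, though? Perhaps you are confusing it with the old, deprecated RANK syntax, which used GROUP BY instead of PARTITION. This will probably return the same result more efficiently: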
select
CDR_ODS_R_GE_OBJ_HST .FROM_ID as FROM_ID,
CDR_ODS_R_GE_OBJ_HST .FROM_NAME as FROM_NAME ,
CDR_ODS_R_GE_OBJ_HST .FROM_REVISION as FROM_REVISION,
CDR_ODS_R_GE_OBJ_HST .LAST_UPDATE_DATE as LAST_UPDATE_DATE
,RANK( )
OVER (PARTITION BY FROM_NAME
ORDER BY CDR_ODS_R_GE_OBJ_HST.LAST_UPDATE_DATE DESC) AS RANK1
from GEEDW_PLM_ODS_BULK_V.CDR_ODS_R_GE_OBJ_HST CDR_ODS_R_GE_OBJ_HST
--WHERE CDR_ODS_R_GE_OBJ_HST.FROM_name='323A4747UUP15A'
QUALIFY RANK1<=2
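As a rough sanity check of the row count the question expects (unchanged parts contribute one row, revised parts contribute two), the GROUP BY-free form can be tried on a small hypothetical data set. The sketch below again assumes Python's sqlite3 with window-function support, an invented table obj_hst, invented part names P1 to P5, and a derived-table filter standing in for QUALIFY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obj_hst (from_name TEXT, from_revision TEXT, last_update_date TEXT)"
)

# Hypothetical data: P1..P3 were never revised; P4 and P5 have three revisions each.
rows = [(f"P{i}", "-", "2013-01-01 00:00:00") for i in (1, 2, 3)]
for name in ("P4", "P5"):
    for month, rev in enumerate(("A", "B", "C"), start=1):
        rows.append((name, rev, f"2013-0{month}-01 00:00:00"))
conn.executemany("INSERT INTO obj_hst VALUES (?, ?, ?)", rows)

# No GROUP BY: rank the raw rows per part and keep the two newest.
latest_two = conn.execute(
    """
    SELECT from_name, from_revision, last_update_date
    FROM (
        SELECT from_name, from_revision, last_update_date,
               RANK() OVER (PARTITION BY from_name
                            ORDER BY last_update_date DESC) AS rnk
        FROM obj_hst
    ) AS t
    WHERE rnk <= 2              -- stands in for QUALIFY rnk <= 2
    ORDER BY from_name, last_update_date DESC
    """
).fetchall()

unchanged, changed = 3, 2
expected_rows = unchanged + changed * 2  # same arithmetic as 60 + 40 * 2

print(len(latest_two), expected_rows)
```

Note that RANK keeps every row of a tie, so a part whose two newest rows share the same timestamp could return more than two rows; the count above holds only when timestamps within a part are distinct.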