Hive query equivalent to SQL QUALIFY

Time: 2015-04-28 07:25:24

Tags: sql hive oracle-sqldeveloper hiveql row-number

I am trying to create a Hive query from an Oracle SQL query. Basically, I want to select the first record, sorted in descending order by UPDATED_TM, DATETIME, ID_NUM:

    SELECT
      tbl1.NUM AS ID,
      tbl1.UNIT AS UNIT,
      tbl2.VALUE AS VALUE,
      tbl1.CONTACT AS CONTACT_NAME,
      'FILE' AS SOURCE,
      CURDATE() AS DATE
    FROM DB1.TBL1 tbl1
    LEFT JOIN DB1.TBL2 tbl2
      ON tbl1.USR_ID = tbl2.USR_ID
    WHERE tbl1.UNIT IS NOT NULL
      AND tbl1.TYPE = 'Generic'
    QUALIFY ROW_NUMBER() OVER (PARTITION BY tbl1.ROW_ID ORDER BY tbl1.UPDATED_TM DESC, tbl1.DATETIME DESC, tbl1.ID_NUM DESC) = 1

I tried using an equivalent Hive query (one that is also SQL-compatible).

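Does this look right? Is there any way to optimize the query? The tables I am working with are very large, and I would like it to be as efficient as possible.

Thanks.

1 answer:

Answer 0: (score: 2)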

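    -- Rank the filtered TBL1 rows first, then join only the top row per ROW_ID.
    SELECT
      tbl1.NUM     AS ID,
      tbl1.UNIT    AS UNIT,
      tbl2.VALUE   AS VALUE,
      tbl1.CONTACT AS CONTACT_NAME,
      'FILE'       AS SOURCE,
      -- CURDATE() is carried over from the original query; Hive may need
      -- CURRENT_DATE (Hive 1.2.0+) instead.
      CURDATE()    AS DATE
    FROM
    (
      SELECT
        USR_ID, TYPE, NUM, UNIT, CONTACT,  -- CONTACT is needed by the outer SELECT
        ROW_NUMBER() OVER (
          PARTITION BY tbl.ROW_ID
          ORDER BY tbl.UPDATED_TM DESC, tbl.DATETIME DESC, tbl.ID_NUM DESC
        ) AS RNUM
      FROM
      (
        -- Filter early so the window function sees fewer rows.
        SELECT
          USR_ID, TYPE, NUM, UNIT, CONTACT, ROW_ID, UPDATED_TM, DATETIME, ID_NUM
        FROM DB1.TBL1
        WHERE UNIT IS NOT NULL
          AND TYPE = 'Generic'
      ) tbl
    ) tbl1
    LEFT OUTER JOIN DB1.TBL2 tbl2
      ON tbl1.USR_ID = tbl2.USR_ID
    WHERE tbl1.RNUM = 1;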
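
One difference from the original is worth noting: the query above ranks the TBL1 rows before the join, so if TBL2 holds several rows for the same USR_ID, more than one row per ROW_ID can come back. QUALIFY in the original is evaluated over the joined, filtered result. If that exact behaviour matters, a possible sketch is to apply ROW_NUMBER() after the join instead. The alias `ranked` is a name made up for this example, CURRENT_DATE assumes Hive 1.2.0 or later, and the backticks around the DATE alias assume quoted identifiers are enabled.

```sql
-- Sketch only: number the rows after the join and the WHERE filter,
-- which is where QUALIFY is evaluated in the original query.
SELECT
  ranked.ID,
  ranked.UNIT,
  ranked.VALUE,
  ranked.CONTACT_NAME,
  'FILE'       AS SOURCE,
  CURRENT_DATE AS `DATE`   -- assumption: Hive 1.2.0+; replaces CURDATE()
FROM (
  SELECT
    tbl1.NUM     AS ID,
    tbl1.UNIT    AS UNIT,
    tbl2.VALUE   AS VALUE,
    tbl1.CONTACT AS CONTACT_NAME,
    ROW_NUMBER() OVER (
      PARTITION BY tbl1.ROW_ID
      ORDER BY tbl1.UPDATED_TM DESC, tbl1.DATETIME DESC, tbl1.ID_NUM DESC
    ) AS RNUM
  FROM DB1.TBL1 tbl1
  LEFT OUTER JOIN DB1.TBL2 tbl2
    ON tbl1.USR_ID = tbl2.USR_ID
  WHERE tbl1.UNIT IS NOT NULL
    AND tbl1.TYPE = 'Generic'
) ranked          -- hypothetical alias
WHERE ranked.RNUM = 1;
```

The trade-off is that the window function now runs over the joined result, which is likely to cost more on very large tables. If TBL2 has at most one row per USR_ID, both forms should return the same rows and the rank-then-join version is probably the cheaper one.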