Using SQL to find rows with extreme values

Time: 2015-08-10 12:21:34

Tags: sql group-by hive sas proc-sql

I have a table t1 with 4 columns:

key, cd, date, result_num

In SAS, we have the following code:

PROC SQL;
    create table t2 AS
    select * from t1
    group by key
    having date = MAX(date)
    order by key, cd;
RUN;

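My understanding is that when an aggregate function (such as MAX) is used, every selected column must either appear in the group by clause or have an aggregate function applied to it. My goal is to convert this SAS code to SQL. Is there a way to do this in SQL (more specifically, HiveQL)?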

2 Answers:

Answer 0: (score: 1)

I don't think your query does what you want in SAS... though maybe it does. In standard SQL (and Hive), you can do the following:

create table t2 AS
    select *
    from (select t1.*,
                 row_number() over (partition by key order by date desc) as seqnum
          from t1
         ) t1
    where seqnum = 1
    order by key, cd;
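One point worth checking is how ties are handled. row_number() keeps exactly one row per key, while a filter of the form date = MAX(date) within each key keeps every row that shares the latest date; rank() behaves like the latter. The sketch below illustrates the difference in Python, using the standard-library sqlite3 module as a stand-in for Hive. That substitution is an assumption, the sample rows are made up, and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical sample rows for t1(key, cd, date, result_num).
# Key 1 has two rows tied on its latest date.
ROWS = [
    (1, "a", "2015-08-01", 10),
    (1, "b", "2015-08-09", 20),
    (1, "c", "2015-08-09", 30),
    (2, "a", "2015-07-15", 40),
    (2, "b", "2015-07-01", 50),
]


def latest_rows(window_fn):
    """Return the rows that window_fn ranks first within each key."""
    if window_fn not in ("row_number", "rank"):
        raise ValueError("window_fn must be 'row_number' or 'rank'")
    conn = sqlite3.connect(":memory:")
    # "key" is quoted because KEY is a keyword in SQLite.
    conn.execute(
        'create table t1 ("key" integer, cd text, "date" text, result_num integer)'
    )
    conn.executemany("insert into t1 values (?, ?, ?, ?)", ROWS)
    query = f"""
        select "key", cd, "date", result_num
        from (select t1.*,
                     {window_fn}() over (partition by "key"
                                         order by "date" desc) as seqnum
              from t1) t
        where seqnum = 1
        order by "key", cd
    """
    rows = conn.execute(query).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(len(latest_rows("row_number")))  # one row per key
    print(len(latest_rows("rank")))        # tied rows are all kept
```

If the tied rows matter, replacing row_number() with rank() in the query above is the smaller change; which of the tied rows row_number() keeps is not defined.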

Answer 1: (score: 1)

The trick is to access the input table twice: once to compute the maximum date, and once to select the matching rows.

If you are looking for rows whose date is the latest date in the entire table:

PROC SQL;
    create table t2 AS
    select * from t1
    where date = (select MAX(date) from t1)
    order by key, cd;
QUIT;

If you are looking for rows whose date is the latest date for the same key:

PROC SQL;
    create table t2 AS
    select t1.*
    from t1 inner join
    (  select key, MAX(date) as maxDate  /* key is needed for the join below */
       from t1
       group by key) as m1
       on m1.key = t1.key and m1.maxDate = t1.date
    order by t1.key, t1.cd;
QUIT;
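The per-key join can be tried out with a small Python sketch that uses the standard-library sqlite3 module in place of SAS or Hive. That substitution is an assumption, and the sample rows are invented. The derived table has to select key alongside MAX(date), otherwise the join condition on m1.key has nothing to refer to.

```python
import sqlite3

# Hypothetical sample rows for t1(key, cd, date, result_num).
# Dates are ISO-formatted text, so MAX() compares them correctly.
SAMPLE = [
    (1, "a", "2015-08-01", 10),
    (1, "b", "2015-08-09", 20),
    (1, "c", "2015-08-09", 30),
    (2, "a", "2015-07-15", 40),
    (2, "b", "2015-07-01", 50),
]


def latest_per_key(rows):
    """Return every row whose date equals the maximum date for its key."""
    conn = sqlite3.connect(":memory:")
    # "key" is quoted because KEY is a keyword in SQLite.
    conn.execute(
        'create table t1 ("key" integer, cd text, "date" text, result_num integer)'
    )
    conn.executemany("insert into t1 values (?, ?, ?, ?)", rows)
    query = """
        select t1."key", t1.cd, t1."date", t1.result_num
        from t1
        inner join (select "key", max("date") as maxDate
                    from t1
                    group by "key") as m1
          on m1."key" = t1."key" and m1.maxDate = t1."date"
        order by t1."key", t1.cd
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_key(SAMPLE):
        print(row)
```

With this approach, rows tied on the latest date for a key are all returned.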