I have a table t1 with 4 columns:
key, cd, date, result_num
In SAS, we have the following code:
PROC SQL;
create table t2 AS
select * from t1
group by key
having date = MAX(date)
order by key, cd;
RUN;
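My understanding is that when an aggregate function (such as MAX) is used, every selected column must either appear in the group by or have an aggregate function applied to it. My goal is to convert this SAS code to SQL. Is there a way to do this in SQL (more specifically, HiveQL)?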
Answer 0 (score: 1)
I don't think your query does what you want in SAS... though perhaps it does. In standard SQL (and Hive), you can do the following:
create table t2 AS
select *
from (select t1.*,
             row_number() over (partition by key order by date desc) as seqnum
      from t1
     ) t1
where seqnum = 1
order by key, cd;
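One difference worth checking: SAS PROC SQL remerges the group's MAX(date) back onto every row, so the original HAVING clause keeps all rows that share the latest date for a key, whereas row_number() keeps exactly one row per key. If ties matter, rank() is one option. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for Hive (an assumption, not something from the question), and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for Hive here (an assumption for illustration);
# the sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (key text, cd text, date text, result_num integer)")
conn.executemany(
    "insert into t1 values (?, ?, ?, ?)",
    [
        ("a", "x", "2020-01-01", 1),
        ("a", "y", "2020-02-01", 2),
        ("a", "z", "2020-02-01", 3),  # ties with the previous row on the latest date
        ("b", "x", "2020-03-01", 4),
    ],
)

# Same shape as the query above; only the window function name varies.
QUERY = """
select key, cd, date, result_num
from (select t1.*,
             {fn}() over (partition by key order by date desc) as seqnum
      from t1) t
where seqnum = 1
order by key, cd
"""

# row_number() numbers tied rows 1, 2, ... so only one row per key survives.
rows_row_number = conn.execute(QUERY.format(fn="row_number")).fetchall()

# rank() gives every tied row the same rank, so all latest-date rows survive.
rows_rank = conn.execute(QUERY.format(fn="rank")).fetchall()

print(len(rows_row_number), "rows with row_number()")
print(len(rows_rank), "rows with rank()")
```

Which of the two tied rows row_number() keeps for key a is not determined by the query, so if exactly one row per key is wanted it may be worth adding a tie-breaker column to the window's order by.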
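Answer 1 (score: 1)
The trick is to access the input table twice: once to compute the maximum date, and once to select the matching rows.
If you are looking for the rows whose date is the highest date that occurs anywhere in the table, i.e.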
PROC SQL;
create table t2 AS
select * from t1
where date = (select MAX(date) from t1)
order by key, cd;
QUIT;
If you are looking for the rows whose date is the highest date for the same key, i.e.
PROC SQL;
create table t2 AS
select t1.* from t1 inner join
    ( select key, MAX(date) as maxDate
      from t1
      group by key) as m1
on m1.key = t1.key and m1.maxDate = t1.date
order by t1.key, t1.cd;
QUIT;
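To see how the two variants differ, here is a small sketch that runs both against made-up rows. It assumes an in-memory SQLite database as a stand-in for SAS or Hive, and in the per-key variant the subquery also selects key so that the join condition can reference m1.key.

```python
import sqlite3

# In-memory SQLite is only a stand-in (an assumption); the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("create table t1 (key text, cd text, date text, result_num integer)")
conn.executemany(
    "insert into t1 values (?, ?, ?, ?)",
    [
        ("a", "x", "2020-01-01", 1),
        ("a", "y", "2020-02-01", 2),
        ("a", "z", "2020-02-01", 3),
        ("b", "x", "2020-03-01", 4),
    ],
)

# Variant 1: rows carrying the latest date of the whole table.
global_max = conn.execute("""
    select * from t1
    where date = (select max(date) from t1)
    order by key, cd
""").fetchall()

# Variant 2: rows carrying the latest date of their own key.
per_key_max = conn.execute("""
    select t1.*
    from t1 inner join
         (select key, max(date) as maxDate
          from t1
          group by key) as m1
      on m1.key = t1.key and m1.maxDate = t1.date
    order by t1.key, t1.cd
""").fetchall()

print(global_max)
print(per_key_max)
```

With these rows, the first variant returns only the single row for key b, while the second returns one or more rows for every key, including both rows of key a that share its latest date.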