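I have a table that records the size of various clusters together with the date each cluster was scanned. For each cluster, I need the size as of the latest scan date in each month. I tried the following query in Impala SQL, but it does not produce the expected result.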
Scandate cluster Size
11/4/2017 ABC 200
11/18/2017 ABC 700
11/25/2017 ABC 1009
12/4/2017 ABC 200
12/18/2017 ABC 700
12/20/2017 ABC 1100
1/4/2018 ABC 200
1/18/2018 ABC 700
1/20/2018 ABC 1009
11/4/2017 CAD 200
11/18/2017 CAD 700
11/25/2017 CAD 1009
12/4/2017 CAD 200
12/18/2017 CAD 700
12/20/2017 CAD 1100
Expected result
Data cluster Size
11/25/2017 ABC 1009
12/20/2017 ABC 1100
1/20/2018 ABC 1009
11/25/2017 CAD 1009
12/20/2017 CAD 1100
SELECT t.*
FROM arxview.test_summary t
INNER JOIN
(SELECT MONTH(scandate) AS month, MAX(DAY(scandate)) AS day, cluster
FROM arxview.test_summary t
GROUP BY MONTH(scandate), cluster) sub
ON (MONTH(t.scandate) = sub.month AND DAY(t.scandate) = sub.day AND t.cluster = sub.cluster)
Answer 0 (score: 2)
Another approach uses window functions:
select ts.*
from (select t.*,
             -- latest scan date per cluster within each calendar month
             max(scandate) over (partition by cluster, year(scandate), month(scandate)) as max_scandate_monthly
      from arxview.test_summary t
     ) ts
where scandate = max_scandate_monthly;
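To sanity-check the window-function logic against the sample data from the question without an Impala cluster, here is a minimal sketch using Python's built-in sqlite3 module. Several things here are assumptions and not part of the original setup: SQLite stands in for Impala (window functions need SQLite 3.25 or newer), the table is created as plain test_summary without the arxview schema, the dates are rewritten as ISO strings, and strftime('%Y-%m', scandate) replaces year(scandate), month(scandate) because SQLite has no such functions. The partition also includes cluster, since the question asks for the latest scan per cluster per month.

```python
import sqlite3

# Sample rows from the question, with dates rewritten as ISO strings
# (assumption: the real scandate column is a date/timestamp type in Impala).
ROWS = [
    ("2017-11-04", "ABC", 200), ("2017-11-18", "ABC", 700), ("2017-11-25", "ABC", 1009),
    ("2017-12-04", "ABC", 200), ("2017-12-18", "ABC", 700), ("2017-12-20", "ABC", 1100),
    ("2018-01-04", "ABC", 200), ("2018-01-18", "ABC", 700), ("2018-01-20", "ABC", 1009),
    ("2017-11-04", "CAD", 200), ("2017-11-18", "CAD", 700), ("2017-11-25", "CAD", 1009),
    ("2017-12-04", "CAD", 200), ("2017-12-18", "CAD", 700), ("2017-12-20", "CAD", 1100),
]

# Same shape as the window-function query, adapted to SQLite:
# strftime('%Y-%m', ...) is a stand-in for year(...), month(...).
QUERY = """
select scandate, cluster, size
from (select t.*,
             max(scandate) over (
                 partition by cluster, strftime('%Y-%m', scandate)
             ) as max_scandate_monthly
      from test_summary t
     ) ts
where scandate = max_scandate_monthly
order by cluster, scandate
"""


def latest_per_cluster_month(rows):
    """Load rows into an in-memory table and return the latest scan per cluster per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table test_summary (scandate text, cluster text, size integer)")
    conn.executemany("insert into test_summary values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_per_cluster_month(ROWS):
        print(row)
```

Note that if a cluster has two rows with the same latest scan date in a month, this filter returns both of them. If exactly one row per cluster and month is required, ranking with row_number() ordered by scandate descending and keeping rank 1 is a common alternative.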