这是我的表格(间距仅为了更好的视图)
month country temp
1 AU 15
1 AU 20
2 AU 10
3 AU 20
3 AU 20
3 AU 15
1 CZ 10
1 CZ 5
1 CZ 3
2 CZ 15
2 CZ 15
3 CZ 20
3 CZ 10
1 DE 8
1 DE 2
2 DE 16
2 DE 12
3 DE 21
3 DE 19
我需要显示每月的最高平均温度,并使用SQL显示HIVE中的国家/地区。像这样的输出
month country max_avg_temp
1 AU 17.5
2 CZ 15
3 DE 20
所以我需要按国家和月份计算avg(temp)组,然后在每个月找到最大值。
我试过这样的事情:
SELECT month, country, MAX(avg_temp) AS max_avg_temp
FROM (SELECT month, country, AVG(temp) as avg_temp FROM temperature
GROUP BY month, country) alias
GROUP BY month;
我也想到这样的事情:
SELECT *
, RANK() OVER (PARTITION BY month ORDER BY avg_temp DESC) AS rank
FROM (SELECT month, country, AVG(temp) AS avg_temp FROM temperature
GROUP BY month, country)
ORDER BY rank
LIMIT 12;
但它没有奏效(有些错误)。平均工作的嵌套选择很好,但我不知道如何显示最大值。
提前致谢
答案 0 :(得分:1)
使用row_number()
过滤平均值最高的记录
SELECT month, country, avg_temp
FROM ( SELECT month,
country,
AVG (temp) AS avg_temp,
ROW_NUMBER ()
OVER (PARTITION BY month ORDER BY AVG (temp) DESC)
rnk
FROM temperature
GROUP BY month, country) t
WHERE rnk = 1;
答案 1 :(得分:0)
只是愚蠢的错误,我忘记了一些“别名” - AS t
SELECT *
, RANK() OVER (PARTITION BY mesic ORDER BY avg_temp DESC) AS rank
FROM (SELECT mesic, stat, AVG(teplota) AS avg_temp FROM teploty
GROUP BY mesic, stat) AS t
ORDER BY rank, mesic
limit 12;
它确实显示每个蛾的最高平均温度
+--------------+-------------+---------------------+-------+--+
| t.month | t.country | t.avg_temp | rank |
+--------------+-------------+---------------------+-------+--+
| 1 | AS | 28.1482513153823 | 1 |
| 2 | AS | 28.268547273982037 | 1 |
| 3 | AS | 28.305594514570675 | 1 |
| 4 | PW | 28.14697855750485 | 1 |
| 5 | PW | 28.15864931145062 | 1 |
| 6 | MH | 28.085874246339472 | 1 |
| 7 | AZ | 29.07906528274055 | 1 |
| 8 | TX | 28.335826966364785 | 1 |
| 9 | MH | 28.220327304048165 | 1 |
| 10 | MH | 28.22359756605796 | 1 |
| 11 | MH | 28.125538329026583 | 1 |
| 12 | AS | 27.996042413381048 | 1 |
+--------------+-------------+---------------------+-------+--+