Get the top records for each group

Date: 2015-09-23 15:26:15

Tags: postgresql select group-by greatest-n-per-group

The query below gives me the following result.

SELECT   app, srcip,
         sum(COALESCE(sentbyte, 0) + COALESCE(rcvdbyte, 0)) AS bandwidth
FROM     $log
WHERE    logid_to_int(logid) NOT IN (4, 7, 14)
GROUP BY app, srcip
HAVING   sum(COALESCE(sentbyte, 0) + COALESCE(rcvdbyte, 0)) > 0
ORDER BY bandwidth

Result:

+----------+----------------+----------+
| app      | srcip          | bandwidth|
+----------+----------------+----------+
| TCP/8080 | 91.236.75.4    | 40       |
| TCP/8080 | 198.74.123.215 | 40       |
| SMTP     | 80.82.64.127   | 44       |
| YouTube  | 192.168.1.170  | 52       |
| TCP/8080 | 121.40.233.129 | 52       |
| HTTP     | 192.168.1.167  | 60       |
| HTTP     | 192.168.1.218  | 60       |
| HTTP     | 199.203.59.117 | 96       |
+----------+----------------+----------+

I have been asked to return the top 15 or top 20 rows per app, ordered by bandwidth within each app, like this:

+----------+----------------+----------+
| app      | srcip          | bandwidth|
+----------+----------------+----------+
| TCP/8080 | 121.40.233.129 | 52       |
| TCP/8080 | 91.236.75.4    | 40       |
| TCP/8080 | 198.74.123.215 | 40       |
| SMTP     | 80.82.64.127   | 44       |
| YouTube  | 192.168.1.170  | 52       |
| HTTP     | 199.203.59.117 | 96       |
| HTTP     | 192.168.1.167  | 60       |
| HTTP     | 192.168.1.218  | 60       |
+----------+----------------+----------+

I tried modifying the GROUP BY clause, but that did not work. Any help is much appreciated.

1 answer:

Answer 0 (score: 1)

Let's say dataset is the result of your query. Then the query

select *, row_number() over (partition by app order by bandwidth)
from dataset

adds a new column, row_number, to dataset:

   app    |     srcip      | bandwidth | row_number
----------+----------------+-----------+------------
 HTTP     | 192.168.1.218  |        60 |          1
 HTTP     | 192.168.1.167  |        60 |          2
 HTTP     | 199.203.59.117 |        96 |          3
 SMTP     | 80.82.64.127   |        44 |          1
 TCP/8080 | 91.236.75.4    |        40 |          1
 TCP/8080 | 198.74.123.215 |        40 |          2
 TCP/8080 | 121.40.233.129 |        52 |          3
 YouTube  | 192.168.1.170  |        52 |          1
(8 rows)

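The rows of each app are now numbered, here in ascending order of bandwidth (ordering by bandwidth DESC inside the OVER clause would number them from the highest instead). It is now easy to limit the number of rows selected in each group: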

select app, srcip, bandwidth
from (
    select *, row_number() over (partition by app order by bandwidth)
    from dataset
    ) sub
where row_number < 3

   app    |     srcip      | bandwidth
----------+----------------+-----------
 HTTP     | 192.168.1.218  |        60
 HTTP     | 192.168.1.167  |        60
 SMTP     | 80.82.64.127   |        44
 TCP/8080 | 91.236.75.4    |        40
 TCP/8080 | 198.74.123.215 |        40
 YouTube  | 192.168.1.170  |        52
(6 rows)
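
The question asks for the rows with the highest bandwidth per app, so the window most likely needs a descending order. The following is a minimal sketch under a few assumptions: the dataset CTE is built from the sample rows in the question so the statement runs on its own, the alias rn and the srcip tie-breaker are illustrative choices that are not part of the original answer, and the limit of 2 stands in for the 15 or 20 mentioned in the question.

```sql
-- "dataset" stands in for the result of the aggregated query in the question;
-- the rows below are copied from the question's sample output.
WITH dataset (app, srcip, bandwidth) AS (
    VALUES
        ('TCP/8080', '91.236.75.4',    40),
        ('TCP/8080', '198.74.123.215', 40),
        ('SMTP',     '80.82.64.127',   44),
        ('YouTube',  '192.168.1.170',  52),
        ('TCP/8080', '121.40.233.129', 52),
        ('HTTP',     '192.168.1.167',  60),
        ('HTTP',     '192.168.1.218',  60),
        ('HTTP',     '199.203.59.117', 96)
),
ranked AS (
    SELECT app, srcip, bandwidth,
           -- DESC numbers the highest bandwidth first within each app;
           -- srcip is an assumed tie-breaker so equal bandwidths get a stable order.
           row_number() OVER (PARTITION BY app
                              ORDER BY bandwidth DESC, srcip) AS rn
    FROM dataset
)
SELECT app, srcip, bandwidth
FROM ranked
WHERE rn <= 2          -- illustrative N; use 15 or 20 as required
ORDER BY app, rn;
```

With real data, the VALUES list would be replaced by the original aggregation query. If rows that tie on bandwidth should all be kept, rank() or dense_rank() could be used in place of row_number(), at the cost of possibly returning more than N rows per app.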

Read about window functions in the documentation.