我在postgres中有一个简单的表:
select remoteaddr,
count (remoteaddr)
from domain_visitors
group by remoteaddr
having count (remoteaddr) > 500
我使用以下内容进行了调整:
remoteaddr
如何选择其他列,仍然只按{{1}}分组?
答案 0 :(得分:0)
选项1:您可以使用array_agg()
函数将其他列值连接到分组列表中:
SELECT
remoteaddr,
array_agg(DISTINCT username) AS unique_users,
array_agg(username) AS repeated_users,
count(remoteaddr) as remote_count
FROM domain_visitors
GROUP BY remoteaddr;
见this SQL Fiddle。此查询将返回如下所示的内容:
+----------------+---------------------------------+-----------------------------------------------------------------------------------------------------+--------------+
| remoteaddr | unique_users | repeated_users | remote_count |
+----------------+---------------------------------+-----------------------------------------------------------------------------------------------------+--------------+
| 142.4.218.156 | anotheruser,user9688766,vistor1 | user9688766,anotheruser,vistor1,vistor1,vistor1,vistor1,vistor1,anotheruser,anotheruser,anotheruser | 10 |
| 158.69.26.144 | anotheruser,user9688766 | anotheruser,user9688766,user9688766,user9688766,user9688766 | 5 |
| 167.114.209.28 | vistor1 | vistor1 | 1 |
+----------------+---------------------------------+-----------------------------------------------------------------------------------------------------+--------------+
选项2:您可以将第一个查询放在common table expression(又称“WITH”子句)中,并将其与原始表连接,如下所示:
WITH grouped_addr AS (
SELECT remoteaddr, count(remoteaddr) AS remote_count
FROM domain_visitors
GROUP BY remoteaddr
)
SELECT ga.remoteaddr, dv.username, ga.remote_count
FROM grouped_addr ga
INNER JOIN domain_visitors dv
ON ga.remoteaddr = dv.remoteaddr
WHERE remote_count > 500;
这是SQL Fiddle。
请记住,这将返回任何其他列的重复结果(在此示例中为username
)。这不是通常你想要的。请注意Fiddles中的每个SELECT示例,看看哪个最适合您的目的。