I have a table that stores Apache log data. It looks like this:
CREATE TABLE access_log
(
id bigint NOT NULL,
client_ip character varying(255),
host character varying(255),
host_name character varying(255),
log_date timestamp without time zone,
method character varying(255),
module character varying(255),
protocol character varying(255),
referer character varying(4096),
size bigint,
status_code integer,
system character varying(255),
url character varying(4096),
user_agent character varying(1024),
CONSTRAINT access_log_pkey PRIMARY KEY (id )
)
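In PostgreSQL, I need to find the 20 most frequently occurring rows, where status_code, url, and method are the same. The query below works fine, but I can't get the other columns, such as log_date and protocol, which are not in the GROUP BY clause.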
select status_code,url,method from access_log
group by status_code,url,method
order by count(*) desc
limit 20
How can I do this? I will have a lot of rows, about 60,000, so performance is a very important factor.
Answer 0 (score: 1)
select *
from (
select al.*, count(*) over (partition by status_code, url, method) as cnt
from access_log al
) t
order by cnt desc
limit 20
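Note that the window-function query above ranks individual rows, so the 20 rows it returns may all come from the single most frequent group. If what is wanted is the 20 most frequent groups together with their rows, one possible approach is to compute the top groups first and join back to the table. This is only a sketch: the alias top_groups and the final sort order are illustrative choices, not part of the original question.

```sql
-- Sketch: find the 20 most frequent (status_code, url, method) groups,
-- then join back to fetch every row that belongs to one of them.
SELECT al.*, top_groups.cnt
FROM access_log al
JOIN (
    SELECT status_code, url, method, count(*) AS cnt
    FROM access_log
    GROUP BY status_code, url, method
    ORDER BY count(*) DESC
    LIMIT 20
) top_groups
  -- Plain "=" never matches NULLs; if these columns can be NULL,
  -- IS NOT DISTINCT FROM could be used instead (an assumption to verify).
  ON  al.status_code = top_groups.status_code
  AND al.url         = top_groups.url
  AND al.method      = top_groups.method
ORDER BY top_groups.cnt DESC, al.status_code, al.url, al.method, al.log_date;
```

This returns more than 20 rows (all rows of the 20 groups). The aggregate subquery still has to read the whole table once, so it is worth checking the plan with EXPLAIN ANALYZE on real data.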
Answer 1 (score: 0)
You can use the DISTINCT ON clause, for example:
SELECT DISTINCT ON (status_code,url,method) * FROM access_log;
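On its own, DISTINCT ON returns one row per (status_code, url, method) group, but without an ORDER BY the row kept for each group is unpredictable, and the result is not ranked by frequency. A possible sketch that combines it with a window count to get one representative row for each of the 20 most frequent groups follows. Picking the most recent log_date as the representative is an assumption made here for illustration.

```sql
-- Sketch: one row per group (the latest by log_date), ranked by group size.
SELECT *
FROM (
    SELECT DISTINCT ON (status_code, url, method)
           al.*,
           -- Window functions are evaluated before DISTINCT ON,
           -- so cnt is the size of the whole group.
           count(*) OVER (PARTITION BY status_code, url, method) AS cnt
    FROM access_log al
    -- The leading ORDER BY expressions must match the DISTINCT ON list;
    -- the trailing one decides which row of each group is kept.
    ORDER BY status_code, url, method, log_date DESC NULLS LAST
) t
ORDER BY cnt DESC
LIMIT 20;
```

The inner query has to sort the full table, so whether this performs acceptably should be checked with EXPLAIN ANALYZE against the real data volume.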