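I have a very simple table (LOG) with the attributes MAC_ADDR, IP_SRC, IP_DST, URL, and PROTOCOL. For each IP_SRC, I want the first n rows of (IP_SRC, URL, #OfOccurrences) where PROTOCOL = 'DNS', sorted by decreasing #OfOccurrences.
To put it more clearly: I want to list the n most visited pages for each IP_SRC in my table.
I can get the most visited URL for each IP_SRC: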
select ip_src, url, cnt
from (
    select ip_src, url, count(*) as cnt, protocol
    from log as b
    group by ip_src, url
    order by ip_src, cnt desc
) as c
where cnt >= (select MAX(cpt)
              from (select count(*) as cpt
                    from log as b
                    where c.ip_src == b.ip_src
                    group by ip_src, url)
             )
  and protocol = 'DNS';
However, this solution is clearly not optimized.
Here is a more practical query (still for the most visited URL per IP_SRC):
select ip_src, url, cnt
from (select ip_src, url, count(*) as cnt
      from log
      where protocol = 'DNS'
      group by ip_src, url
      order by ip_src, cnt asc)
group by ip_src;
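The second option is much faster! However, I want the n most visited pages for each IP_SRC, and I cannot figure out how to do that.
Thanks for your help.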
Answer 0 (score: 1)
WITH Temp1 AS (
    SELECT ip_src, url, count(*) AS cnt
    FROM Log
    WHERE protocol = 'DNS'
    GROUP BY ip_src, url
)
SELECT ip_src, url, cnt
FROM Temp1 AS T1
WHERE url IN (
    SELECT url
    FROM Temp1 AS T2
    WHERE T2.ip_src = T1.ip_src
      AND T2.cnt >= T1.cnt
    ORDER BY cnt DESC
    LIMIT 3 -- or whatever you want it to be
)
ORDER BY ip_src ASC, cnt DESC;
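A variant of the same idea, offered as a sketch rather than a definitive answer: instead of `IN ... LIMIT`, keep a row when fewer than n rows for the same ip_src have a strictly larger count. Ties at the cutoff are then all kept, so a group can return more than n rows. The sample rows, the placeholder MAC and destination addresses, and the use of Python's sqlite3 module as a harness are assumptions made only so the snippet runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (mac_addr TEXT, ip_src TEXT, ip_dst TEXT, url TEXT, protocol TEXT)"
)

# Hypothetical sample data: (ip_src, url, protocol, number of log rows)
sample = [
    ("10.0.0.1", "a.com", "DNS", 3),
    ("10.0.0.1", "b.com", "DNS", 2),
    ("10.0.0.1", "c.com", "DNS", 1),
    ("10.0.0.1", "d.com", "HTTP", 4),  # not DNS, so it must be ignored
    ("10.0.0.2", "x.com", "DNS", 2),
    ("10.0.0.2", "y.com", "DNS", 1),
]
for ip, url, proto, hits in sample:
    conn.executemany(
        "INSERT INTO log VALUES ('aa:bb', ?, '8.8.8.8', ?, ?)",
        [(ip, url, proto)] * hits,
    )

# Keep a row when fewer than n rows of the same ip_src have a larger count.
query = """
WITH Temp1 AS (
    SELECT ip_src, url, count(*) AS cnt
    FROM log
    WHERE protocol = 'DNS'
    GROUP BY ip_src, url
)
SELECT ip_src, url, cnt
FROM Temp1 AS T1
WHERE (SELECT count(*)
       FROM Temp1 AS T2
       WHERE T2.ip_src = T1.ip_src
         AND T2.cnt > T1.cnt) < ?
ORDER BY ip_src ASC, cnt DESC;
"""
top_n = conn.execute(query, (2,)).fetchall()  # n = 2
for row in top_n:
    print(row)
```

Binding n as a parameter avoids editing the query text each time the limit changes.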
Answer 1 (score: 0)
select x.ip_src, x.url, x.cnt
from (select ip_src, url, count(*) as cnt
      from log
      where protocol = 'DNS'
      group by ip_src, url
      order by ip_src, count(*) desc) AS x
group by x.ip_src;
Could you give this a try?
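One caveat, assuming the database is SQLite (the thread never says so, but the `==` operator in the question suggests it): with `GROUP BY ip_src` alone, url and cnt are bare columns, and SQLite documents the row they are taken from as arbitrary, so the inner ORDER BY is not guaranteed to decide it. SQLite does document that when max() is the aggregate, bare columns come from the row holding the maximum. The sketch below uses that behavior for the single most visited URL per ip_src. The sample rows and the Python sqlite3 harness are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (mac_addr TEXT, ip_src TEXT, ip_dst TEXT, url TEXT, protocol TEXT)"
)

# Hypothetical sample data: (ip_src, url, number of DNS log rows)
sample = [
    ("10.0.0.1", "a.com", 3),
    ("10.0.0.1", "b.com", 2),
    ("10.0.0.2", "x.com", 2),
    ("10.0.0.2", "y.com", 1),
]
for ip, url, hits in sample:
    conn.executemany(
        "INSERT INTO log VALUES ('aa:bb', ?, '8.8.8.8', ?, 'DNS')",
        [(ip, url)] * hits,
    )

# max(cnt) makes SQLite take url from the row that holds the maximum count.
query = """
SELECT ip_src, url, max(cnt) AS cnt
FROM (SELECT ip_src, url, count(*) AS cnt
      FROM log
      WHERE protocol = 'DNS'
      GROUP BY ip_src, url)
GROUP BY ip_src
ORDER BY ip_src;
"""
top_one = conn.execute(query).fetchall()
for row in top_one:
    print(row)
```

This relies on SQLite-specific behavior and only covers the top-1 case; other engines generally reject or handle bare columns differently.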
Answer 2 (score: 0)
In the end, I managed to get what I wanted by using a temporary table.
-- First create a temp table of occurrences
CREATE TEMPORARY TABLE TEMP1 AS
    SELECT ip_src, url, count(*) AS cnt
    FROM LOG
    WHERE protocol = 'DNS'
    GROUP BY ip_src, url
    ORDER BY ip_src, cnt, url DESC;
-- Then use a classic limit-per-group query
SELECT T1.ip_src, T1.url, T1.cnt
FROM TEMP1 AS T1
WHERE T1.url IN (
    SELECT T2.url
    FROM TEMP1 AS T2
    WHERE T2.ip_src = T1.ip_src AND T2.cnt >= T1.cnt
    ORDER BY T2.cnt DESC
    LIMIT 3 -- Or whatever you want it to be
)
ORDER BY T1.ip_src ASC, T1.cnt DESC;
If anyone knows how to do this without a temporary table (or can explain why a temporary table is a good solution), please speak up.
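One way to avoid the temporary table, sketched here under assumptions: number the rows of each ip_src by descending count with the `row_number()` window function and keep the first n. This assumes an engine with window functions (SQLite gained them in version 3.25). The CTE names `counts` and `ranked`, the sample rows, and the Python sqlite3 harness are made up for illustration. Ties are broken by url so the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE log (mac_addr TEXT, ip_src TEXT, ip_dst TEXT, url TEXT, protocol TEXT)"
)

# Hypothetical sample data: (ip_src, url, number of DNS log rows)
sample = [
    ("10.0.0.1", "a.com", 3),
    ("10.0.0.1", "b.com", 2),
    ("10.0.0.1", "c.com", 2),  # ties with b.com; url order breaks the tie
    ("10.0.0.2", "x.com", 2),
    ("10.0.0.2", "y.com", 1),
]
for ip, url, hits in sample:
    conn.executemany(
        "INSERT INTO log VALUES ('aa:bb', ?, '8.8.8.8', ?, 'DNS')",
        [(ip, url)] * hits,
    )

query = """
WITH counts AS (
    SELECT ip_src, url, count(*) AS cnt
    FROM log
    WHERE protocol = 'DNS'
    GROUP BY ip_src, url
),
ranked AS (
    SELECT ip_src, url, cnt,
           row_number() OVER (PARTITION BY ip_src
                              ORDER BY cnt DESC, url ASC) AS rn
    FROM counts
)
SELECT ip_src, url, cnt
FROM ranked
WHERE rn <= ?
ORDER BY ip_src ASC, rn ASC;
"""
top_two = conn.execute(query, (2,)).fetchall()  # n = 2
for row in top_two:
    print(row)
```

Swapping `row_number()` for `rank()` would instead keep every row tied at the cutoff, at the cost of sometimes returning more than n rows per ip_src.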