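There are plenty of SQL top-N questions on Stack Overflow, but I can't seem to find one that matches my situation. I want to do some grouping within a top-N query. My data looks like this (obviously fake values).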
MY_DATE IP_ADDRESS
1/1/09 999.999.999.999
1/1/09 999.999.999.999
1/1/09 999.999.999.998
... a lot more rows
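The table's date range covers several months, with thousands of rows per month. What I'd like is a query that tells me the 10 most frequently occurring IP addresses for each month. I can do this for a single month with the following: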
SELECT DATE_FORMAT(MY_DATE, '%b-%y') AS "MONTH", IP_ADDRESS, COUNT(*) AS HITS
FROM MY_DATA
WHERE DATE_FORMAT(MY_DATE, '%b-%y') = 'JAN-09'
GROUP BY DATE_FORMAT(MY_DATE, '%b-%y'), IP_ADDRESS
ORDER BY HITS DESC
LIMIT 10
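But what I really want is to see the top n for every month in the data set. That essentially rules out the WHERE clause I used above. Of course, when I drop it, I just get the top 10 across all months combined. The result I'm looking for should look like this: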
MONTH IP_ADDRESS COUNT(*)
JAN-09 999.999.999.999 200
JAN-09 999.999.999.998 150
... ( 8 more rows of January )
FEB-09 999.999.999.999 320
FEB-09 999.999.999.998 234
... ( 8 more rows of February)
MAR-09 999.999.999.999 440
... ETC.
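Can this be done in MySQL? The obstacle I seem to be running into is that MySQL doesn't allow ORDER BY in the individual query statements of a UNION. Thanks for your help!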
Answer 0 (score: 3)
I just tried a query very similar to the one given by @Charles Bretana, and it does work. I used a VIEW to help clarify things.
CREATE TABLE my_data (
my_date DATE,
ip_address CHAR(15)
);
Insert a bunch of date/IP address pairs (not shown)...
Create a view of the hit count for every IP address in each month:
CREATE VIEW my_data_per_month as
SELECT EXTRACT(YEAR_MONTH FROM my_date) AS month,
ip_address, COUNT(*) AS hits
FROM my_data
GROUP BY month, ip_address;
SELECT * FROM my_data_per_month
ORDER BY month ASC, hits DESC;
+--------+-----------------+------+
| month | ip_address | hits |
+--------+-----------------+------+
| 200901 | 999.999.999.999 | 8 |
| 200901 | 999.999.999.998 | 6 |
| 200901 | 999.999.999.997 | 5 |
| 200901 | 999.999.999.996 | 4 |
| 200901 | 999.999.999.995 | 3 |
| 200901 | 999.999.999.994 | 2 |
| 200902 | 999.999.999.998 | 8 |
| 200902 | 999.999.999.997 | 6 |
| 200902 | 999.999.999.996 | 5 |
| 200902 | 999.999.999.995 | 4 |
| 200902 | 999.999.999.994 | 3 |
| 200902 | 999.999.999.993 | 2 |
| 200903 | 999.999.999.997 | 8 |
| 200903 | 999.999.999.996 | 6 |
| 200903 | 999.999.999.995 | 5 |
| 200903 | 999.999.999.994 | 4 |
| 200903 | 999.999.999.993 | 3 |
| 200903 | 999.999.999.992 | 2 |
+--------+-----------------+------+
Now show the top three IP addresses per month:
SELECT m1.month, m1.ip_address, m1.hits
FROM my_data_per_month m1
LEFT OUTER JOIN my_data_per_month m2
ON (m1.month = m2.month AND m1.hits < m2.hits)
GROUP BY m1.month, m1.ip_address
HAVING COUNT(*) < 3
ORDER BY m1.month ASC, m1.hits DESC;
+--------+-----------------+------+
| month | ip_address | hits |
+--------+-----------------+------+
| 200901 | 999.999.999.999 | 8 |
| 200901 | 999.999.999.998 | 6 |
| 200901 | 999.999.999.997 | 5 |
| 200902 | 999.999.999.998 | 8 |
| 200902 | 999.999.999.997 | 6 |
| 200902 | 999.999.999.996 | 5 |
| 200903 | 999.999.999.997 | 8 |
| 200903 | 999.999.999.996 | 6 |
| 200903 | 999.999.999.995 | 5 |
+--------+-----------------+------+
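The self-join ranking above can also be checked outside MySQL. What follows is only a minimal sketch, under some assumptions that are not in the original answer: it uses SQLite through Python's standard sqlite3 module, it loads the per-month counts from the view output into a plain table (hard-coded in the hypothetical ROWS list), and the helper name top_n_per_month is made up for illustration. It also counts COUNT(m2.hits) rather than COUNT(*), so the number compared against n is exactly the number of rows with strictly more hits in the same month.

```python
import sqlite3

# Per-month hit counts copied from the view output shown earlier in this answer.
ROWS = [
    (200901, "999.999.999.999", 8), (200901, "999.999.999.998", 6),
    (200901, "999.999.999.997", 5), (200901, "999.999.999.996", 4),
    (200901, "999.999.999.995", 3), (200901, "999.999.999.994", 2),
    (200902, "999.999.999.998", 8), (200902, "999.999.999.997", 6),
    (200902, "999.999.999.996", 5), (200902, "999.999.999.995", 4),
    (200902, "999.999.999.994", 3), (200902, "999.999.999.993", 2),
    (200903, "999.999.999.997", 8), (200903, "999.999.999.996", 6),
    (200903, "999.999.999.995", 5), (200903, "999.999.999.994", 4),
    (200903, "999.999.999.993", 3), (200903, "999.999.999.992", 2),
]


def top_n_per_month(rows, n):
    """Return the rows whose hits rank in the top n of their month."""
    conn = sqlite3.connect(":memory:")
    # A plain table stands in for the my_data_per_month view (assumption).
    conn.execute(
        "CREATE TABLE my_data_per_month "
        "(month INTEGER, ip_address TEXT, hits INTEGER)"
    )
    conn.executemany("INSERT INTO my_data_per_month VALUES (?, ?, ?)", rows)
    # For each row m1, m2 matches every row of the same month with more hits.
    # COUNT(m2.hits) ignores the NULL produced by the outer join for the
    # top row, so it is the number of strictly better rows.
    query = """
        SELECT m1.month, m1.ip_address, m1.hits
        FROM my_data_per_month m1
        LEFT OUTER JOIN my_data_per_month m2
          ON m1.month = m2.month AND m1.hits < m2.hits
        GROUP BY m1.month, m1.ip_address, m1.hits
        HAVING COUNT(m2.hits) < ?
        ORDER BY m1.month ASC, m1.hits DESC
    """
    result = conn.execute(query, (n,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in top_n_per_month(ROWS, 3):
        print(row)
```

One thing to keep in mind with this approach: rows that tie on hits at the cutoff all have the same number of better rows, so a month can return more than n rows when there are ties.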
Answer 1 (score: 0)
This is a rough first guess, but try this:
-- Rank each address within its month by counting the addresses
-- that have at least as many hits (the address itself included).
Select Z.Month, Z.Address
From (Select DATE_FORMAT(MY_DATE, '%b-%y') Month,
             IP_Address Address, Count(*) AddressCount
      From MY_DATA
      Group By DATE_FORMAT(MY_DATE, '%b-%y'), IP_Address) Z
Join (Select DATE_FORMAT(MY_DATE, '%b-%y') Month,
             IP_Address Address, Count(*) AddressCount
      From MY_DATA
      Group By DATE_FORMAT(MY_DATE, '%b-%y'), IP_Address) ZZ
  On ZZ.Month = Z.Month
 And ZZ.AddressCount >= Z.AddressCount
Group By Z.Month, Z.Address
-- Filtering on an aggregate needs Having, not Where; keep ranks 1 to 10.
Having Count(ZZ.AddressCount) <= 10