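I'm working on a statistics report project that shows the most frequently used search keywords. The table has about 50 million records.
Table (simplified):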
+----------------------+--------------+
| Field | Type |
+----------------------+--------------+
| acct | varchar(5) |
| branch | varchar(2) |
| page_name | varchar(20) |
| access_time | datetime |
| query_input | varchar(500) |
+----------------------+--------------+
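page_name can be one of 3 values: 'search', 'detail', or 'cart'.
What I need is, for each page_name, the row count of each query_input, sorted in descending order and limited to the top N, all in one query. Initially I just let Hibernate fetch all the records and processed them in Java, but the query takes a very long time even though I use a stateless session.
To reduce the amount of data returned through Hibernate, I tried: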
(SELECT page_name, query_input, count(*) FROM table_name WHERE acct='XXXXX' AND page_name='search' GROUP BY query_input ORDER BY COUNT(*) DESC LIMIT 100)
UNION ALL
(SELECT ... AND page_name='detail' ...)
UNION ALL
(SELECT ... AND page_name='cart' ...)
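But this makes the database scan the table 3 times. Is there a way to rephrase the query so that it scans the table only once but still returns the result I want?
For example, without a limit: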
+----------------------+--------------+---------+
| page_name | query_input | count(*)|
+----------------------+--------------+---------+
| search | CCC | 10 |
| search | EEE | 8 |
| search | AAA | 1 |
| search | BBB | 1 |
| detail | DDD | 12 |
| detail | FFF | 11 |
| detail | HHH | 1 |
| detail | GGG | 1 |
| cart | III | 6 |
| cart | JJJ | 4 |
| cart | LLL | 1 |
| cart | KKK | 1 |
+----------------------+--------------+---------+
With a limit of 2:
+----------------------+--------------+---------+
| page_name | query_input | count(*)|
+----------------------+--------------+---------+
| search | CCC | 10 |
| search | EEE | 8 |
| detail | DDD | 12 |
| detail | FFF | 11 |
| cart | III | 6 |
| cart | JJJ | 4 |
+----------------------+--------------+---------+
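Update
I have a feeling this can't be solved, because I realize I'm actually trying to select rows from the table based on their sort order, and MySQL does not take the sort into account when selecting rows... am I right?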
Answer 0 (score: 0)
Try using the IN operator:
SELECT page_name, query_input, count(*)
FROM table_name
WHERE acct='XXXXX'
AND page_name IN ('search', 'detail', 'cart')
GROUP BY page_name, query_input
ORDER BY COUNT(*) DESC
LIMIT 100
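Note that a single LIMIT caps the whole result set, not each page_name separately. If window functions are available (MySQL 8.0 and later; this is an assumption, since the question doesn't say which version is in use), one way to get a per-page limit is to aggregate once and then rank the counts within each page_name with ROW_NUMBER(). Below is a minimal sketch of that idea. It runs against an in-memory SQLite database (3.25 or newer) from Python only so the example is self-contained; the sample rows and the helper names build_demo_db and top_queries_per_page are made up for illustration and are not from the question.

```python
import sqlite3

# Hypothetical sample data: (page_name, query_input) -> number of rows to insert.
SAMPLE_COUNTS = {
    ("search", "CCC"): 10, ("search", "EEE"): 8, ("search", "AAA"): 1,
    ("detail", "DDD"): 12, ("detail", "FFF"): 11, ("detail", "HHH"): 1,
    ("cart", "III"): 6, ("cart", "JJJ"): 4, ("cart", "LLL"): 1,
}

# Aggregate once in the innermost query, rank inside each page_name,
# then keep only the first N ranks. The "?" placeholders are sqlite3's
# style; a MySQL driver would typically use a different placeholder.
TOP_N_SQL = """
SELECT page_name, query_input, cnt
FROM (
    SELECT page_name, query_input, cnt,
           ROW_NUMBER() OVER (
               PARTITION BY page_name
               ORDER BY cnt DESC, query_input
           ) AS rn
    FROM (
        SELECT page_name, query_input, COUNT(*) AS cnt
        FROM table_name
        WHERE acct = ?
        GROUP BY page_name, query_input
    ) AS counted
) AS ranked
WHERE rn <= ?
ORDER BY page_name, cnt DESC, query_input
"""


def build_demo_db():
    """Create an in-memory table shaped like the one in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_name (acct TEXT, branch TEXT, page_name TEXT,"
        " access_time TEXT, query_input TEXT)"
    )
    for (page, query), n in SAMPLE_COUNTS.items():
        conn.executemany(
            "INSERT INTO table_name (acct, page_name, query_input)"
            " VALUES (?, ?, ?)",
            [("XXXXX", page, query)] * n,
        )
    return conn


def top_queries_per_page(conn, acct, limit):
    """Return up to `limit` most frequent query_input values per page_name."""
    return conn.execute(TOP_N_SQL, (acct, limit)).fetchall()


if __name__ == "__main__":
    for row in top_queries_per_page(build_demo_db(), "XXXXX", 2):
        print(row)
```

The table is read only in the innermost query, so the rows for the account should be scanned once rather than three times; the ranking step works on the much smaller grouped result. Ties are broken by query_input here just to make the output deterministic, which is a choice of this sketch rather than something the question asks for.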