我有此查询,需要花一些时间(大约10分钟)。
SELECT COUNT(DISTINCT(column)) var,
column2 var2
FROM table
WHERE column<>'X' and
column2 between 'Y' and 'Z'
GROUP BY var2
ORDER BY var DESC
任何想法如何优化速度?我尝试使用索引,但仍然很慢。也许它们设置不正确。 Y和Z是时间戳(如果重要),而X是此查询根本不需要的东西,但在表中,因为同一应用程序的其他查询都需要X。该表非常大-数百万行,并且尚未增长。
编辑:以下是示例的解释结果:
mysql> EXPLAIN SELECT COUNT(DISTINCT(ip)) v, geo n from idevaff_iptracking where geo<>'XX' and stamp between '1525122000' and '1543615199' group by n order by v desc;
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | idevaff_iptracking | range | stamp,geo,geo_stamp_ip | geo_stamp_ip | 9 | NULL | 3469323 | Using where; Using index; Using temporary; Using filesort |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.00 sec)
表中的表格如下:
id,acct_id,ip,refer,stamp,hit_time,hit_date,src1,src2,split,sub_id,tid1,tid2,tid3,tid4,target_url,geo。
索引如下:
mysql> SHOW INDEX FROM idevaff_iptracking
-> ;
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| idevaff_iptracking | 0 | PRIMARY | 1 | id | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_ip | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_ip | 2 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | ip | 1 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | stamp | 1 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id | 1 | acct_id | A | 4 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | geo | 1 | geo | A | 440 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid1 | 1 | tid1 | A | 276 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid2 | 1 | tid2 | A | 514 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid3 | 1 | tid3 | A | 34 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid4 | 1 | tid4 | A | 5623 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_stamp_ip | 1 | acct_id | A | 744 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_stamp_ip | 2 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_stamp_ip | 3 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | geo_stamp_ip | 1 | geo | A | 22362 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | geo_stamp_ip | 2 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | geo_stamp_ip | 3 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid1_stamp | 1 | acct_id | A | 658 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid1_stamp | 2 | tid1 | A | 11866 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid1_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid2_stamp | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid2_stamp | 2 | tid2 | A | 18666 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid2_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid3_stamp | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid3_stamp | 2 | tid3 | A | 1832 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid3_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid4_stamp | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid4_stamp | 2 | tid4 | A | 5060 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid4_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
29 rows in set (0.00 sec)
答案 0 :(得分:1)
添加此综合索引:
INDEX(column2, column)
如果这还不够,我们需要查看SHOW CREATE TABLE
以便进行进一步讨论。 (geo_stamp_ip
不好。)
跨列展开数组(tid's
)通常是一个错误。
EXPLAIN FORMAT=JSON
SELECT COUNT(DISTINCT ip) v, geo n
from idevaff_iptracking
where geo<>'XX'
and stamp between '1525122000' AND '1543615199'
group by n
order by v desc;
某些索引是多余的。通常,如果您有INDEX(a)
,则可以删除INDEX(a,b)
。 (例如:acct_id_ip
)