从表WHERE col <>'X'和col2在'Y'和'Z'之间优化SELECT COUNT(DISTINCT(col))var,col2 var2 GROUP BY var2 ORDER BY VAR DESC;为了速度?

时间:2018-11-21 12:50:29

标签: mysql sql database performance mariadb

我有此查询,需要花一些时间(大约10分钟)。

SELECT COUNT(DISTINCT(column)) var, 
       column2 var2 
FROM table 
WHERE column<>'X' and 
      column2 between 'Y' and 'Z' 
GROUP BY var2 
ORDER BY var DESC

任何想法如何优化速度?我尝试使用索引,但仍然很慢。也许它们设置不正确。 Y和Z是时间戳(如果重要),而X是此查询根本不需要的东西,但在表中,因为同一应用程序的其他查询都需要X。该表非常大-数百万行,并且尚未增长。

编辑:以下是示例的解释结果:

    mysql> EXPLAIN SELECT COUNT(DISTINCT(ip)) v, geo n from idevaff_iptracking where geo<>'XX' and stamp between '1525122000' and '1543615199' group by n order by v desc;
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| id   | select_type | table              | type  | possible_keys          | key          | key_len | ref  | rows    | Extra                                                     |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
|    1 | SIMPLE      | idevaff_iptracking | range | stamp,geo,geo_stamp_ip | geo_stamp_ip | 9       | NULL | 3469323 | Using where; Using index; Using temporary; Using filesort |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.00 sec)

表中的表格如下:

id,acct_id,ip,refer,stamp,hit_time,hit_date,src1,src2,split,sub_id,tid1,tid2,tid3,tid4,target_url,geo。

索引如下:

    mysql> SHOW INDEX FROM idevaff_iptracking
    -> ;
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table              | Non_unique | Key_name           | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| idevaff_iptracking |          0 | PRIMARY            |            1 | id          | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_ip         |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_ip         |            2 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | ip                 |            1 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | stamp              |            1 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id            |            1 | acct_id     | A         |           4 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | geo                |            1 | geo         | A         |         440 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid1               |            1 | tid1        | A         |         276 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid2               |            1 | tid2        | A         |         514 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid3               |            1 | tid3        | A         |          34 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | tid4               |            1 | tid4        | A         |        5623 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_stamp_ip   |            1 | acct_id     | A         |         744 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_stamp_ip   |            2 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_stamp_ip   |            3 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | geo_stamp_ip       |            1 | geo         | A         |       22362 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | geo_stamp_ip       |            2 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | geo_stamp_ip       |            3 | ip          | A         |     6775984 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid1_stamp |            1 | acct_id     | A         |         658 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid1_stamp |            2 | tid1        | A         |       11866 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid1_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid2_stamp |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid2_stamp |            2 | tid2        | A         |       18666 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid2_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid3_stamp |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid3_stamp |            2 | tid3        | A         |        1832 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid3_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid4_stamp |            1 | acct_id     | A         |           2 |     NULL | NULL   |      | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid4_stamp |            2 | tid4        | A         |        5060 |     NULL | NULL   | YES  | BTREE      |         |               |
| idevaff_iptracking |          1 | acct_id_tid4_stamp |            3 | stamp       | A         |     6775984 |     NULL | NULL   |      | BTREE      |         |               |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
29 rows in set (0.00 sec)

1 个答案:

答案 0 :(得分:1)

添加此综合索引:

INDEX(column2, column)

如果这还不够,我们需要查看SHOW CREATE TABLE以便进行进一步讨论。 (geo_stamp_ip不好。)

跨列展开数组(tid's)通常是一个错误。

EXPLAIN FORMAT=JSON
SELECT  COUNT(DISTINCT ip) v, geo n
    from  idevaff_iptracking
    where  geo<>'XX'
      and  stamp between '1525122000' AND '1543615199'
    group by  n
    order by  v desc;

某些索引是多余的。通常,如果您有INDEX(a),则可以删除INDEX(a,b)。 (例如:acct_id_ip