Are my MySQL indexes efficient?

Time: 2017-03-03 11:21:32

Tags: mysql indexing innodb

I have the following table:

mysql> describe as_rilevazioni;
+----------------------------+----------+------+-----+---------+----------------+
| Field                      | Type     | Null | Key | Default | Extra          |
+----------------------------+----------+------+-----+---------+----------------+
| id                         | int(11)  | NO   | PRI | NULL    | auto_increment |
| id_sistema_di_monitoraggio | longtext | NO   | MUL | NULL    |                |
| id_unita                   | longtext | NO   |     | NULL    |                |
| id_sensore                 | longtext | NO   |     | NULL    |                |
| data                       | datetime | NO   |     | NULL    |                |
| timestamp                  | longtext | NO   |     | NULL    |                |
| unita_di_misura            | longtext | NO   |     | NULL    |                |
| misura                     | longtext | NO   |     | NULL    |                |
+----------------------------+----------+------+-----+---------+----------------+
8 rows in set (0.00 sec)

The table has the following indexes:

mysql> show indexes from as_rilevazioni;
+----------------+------------+----------+--------------+----------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table          | Non_unique | Key_name | Seq_in_index | Column_name                | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------------+------------+----------+--------------+----------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| as_rilevazioni |          0 | PRIMARY  |            1 | id                         | A         |   315865898 |     NULL | NULL   |      | BTREE      |         |               |
| as_rilevazioni |          0 | UNIQUE   |            1 | id_sistema_di_monitoraggio | A         |          17 |        5 | NULL   |      | BTREE      |         |               |
| as_rilevazioni |          0 | UNIQUE   |            2 | id_unita                   | A         |          17 |       10 | NULL   |      | BTREE      |         |               |
| as_rilevazioni |          0 | UNIQUE   |            3 | id_sensore                 | A         |      145225 |       30 | NULL   |      | BTREE      |         |               |
| as_rilevazioni |          0 | UNIQUE   |            4 | data                       | A         |   315865898 |     NULL | NULL   |      | BTREE      |         |               |
+----------------+------------+----------+--------------+----------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
5 rows in set (0.02 sec)

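I am worried that these indexes are not efficient, because the cardinality of the index on the column "data" is as large as the number of rows! Do these indexes speed up my queries, or do they take up a lot of space without any benefit?

Here is the table definition: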

CREATE TABLE `as_rilevazioni` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `id_sistema_di_monitoraggio` longtext NOT NULL,
  `id_unita` longtext NOT NULL,
  `id_sensore` longtext NOT NULL,
  `data` datetime NOT NULL,
  `timestamp` longtext NOT NULL,
  `unita_di_misura` longtext NOT NULL,
  `misura` longtext NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `UNIQUE` (`id_sistema_di_monitoraggio`(5),`id_unita`(10),`id_sensore`(30),`data`)
) ENGINE=InnoDB AUTO_INCREMENT=437497044 DEFAULT CHARSET=latin1

The main query I run is:

select * from as_rilevazioni where id_sistema_di_monitoraggio="<value>" and id_unita="<value>" and id_sensore="<value>" and data>="<date_1>" and data<="<date2>"

Here is the EXPLAIN output for the query:

mysql> explain select * from as_rilevazioni where id_sistema_di_monitoraggio="235" and id_unita="17" and id_sensore="15" and data >= "2015-01-01 00:00:00" order by data;
+----+-------------+----------------+-------+---------------+--------+---------+------+--------+-------------+
| id | select_type | table          | type  | possible_keys | key    | key_len | ref  | rows   | Extra       |
+----+-------------+----------------+-------+---------------+--------+---------+------+--------+-------------+
|  1 | SIMPLE      | as_rilevazioni | range | UNIQUE        | UNIQUE | 59      | NULL | 285522 | Using where |
+----+-------------+----------------+-------+---------------+--------+---------+------+--------+-------------+
1 row in set (0.00 sec)

Here are the data and index sizes:

mysql> SELECT concat(table_schema,'.',table_name) tables,
    ->        concat(round(table_rows/1000000,2),'M') rows,
    ->        concat(round(data_length/(1024*1024*1024),2),'G') data_size,
    ->        concat(round(index_length/(1024*1024*1024),2),'G') index_size,
    ->        concat(round((data_length+index_length)/(1024*1024*1024),2),'G') total_size,
    ->        round(index_length/data_length,2) index_data_ratio
    -> FROM information_schema.TABLES
    -> WHERE table_name="as_rilevazioni"
    -> ORDER BY total_size DESC;
+------------------------------------+---------+-----------+------------+------------+------------------+
| tables                             | rows    | data_size | index_size | total_size | index_data_ratio |
+------------------------------------+---------+-----------+------------+------------+------------------+
| agriculturalsupport.as_rilevazioni | 317.12M | 19.06G    | 10.25G     | 29.31G     |             0.54 |
+------------------------------------+---------+-----------+------------+------------+------------------+
1 row in set (0.02 sec)

Any suggestions? Thanks, everyone!

1 answer:

Answer 0 (score: 0)

UNIQUE a(5), b(10)

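is terrible. It checks uniqueness of the first 5 bytes of a combined with the first 10 bytes of b. You probably want to check that the combination of the complete a and b is unique.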
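To make the problem concrete, here is a minimal sketch on a throwaway table. The table name `prefix_demo`, the `VARCHAR` columns, and the sample values are hypothetical and not taken from the question. The two rows differ in their full values but share the same prefixes, so the second insert should be rejected with a duplicate-key error.

```sql
-- Hypothetical demo table: the unique key covers only prefixes of a and b.
CREATE TABLE prefix_demo (
  a VARCHAR(20) NOT NULL,
  b VARCHAR(20) NOT NULL,
  UNIQUE KEY uq_prefix (a(5), b(10))
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

-- First row is accepted.
INSERT INTO prefix_demo (a, b) VALUES ('sensor-001', 'unit-A');

-- Different full value of a, but the first 5 characters ('senso') and the
-- value of b are identical, so the prefix-based unique key sees a duplicate
-- and MySQL rejects this row.
INSERT INTO prefix_demo (a, b) VALUES ('sensor-002', 'unit-A');
```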

INDEX a(5), b(10)

is nearly useless: the optimizer will not get past a to even consider b.

INDEX a(5)

is sometimes useless.

UNIQUE a, data  -- where `data` is `DATETIME` or `TIMESTAMP`

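is usually "wrong". Are you really sure that the same a will never occur twice within one second?

"Cardinality" is usually not important when looking at a multi-column index. A cardinality equal to the estimated number of rows in the table means the optimizer thinks the column is unique, but it will not count on that.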

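By "efficient", do you mean "not taking too much space"? Each "row" of the UNIQUE index takes about 1 + 5 + 1 + 10 + 1 + 30 + 5 = 53 bytes. Multiply by 317M rows and you get about 17GB. Add roughly 40% overhead to get about 23GB. That is a lot more than the 10GB reported by information_schema. (The discrepancy involves many approximations, probably mostly the row count.)

Or do you mean "can this index speed up some query"? To discuss that, we need to see the queries. (Meanwhile, I have pointed out several reasons why the index is bad.)

If the IDs are numeric

If they really are numbers, switch to SMALLINT UNSIGNED (2 bytes) or some other integer size. An index containing those 4 columns (with data last) will then quite likely speed up the query significantly. Yes, the index will cost some disk space, but it is probably worth it. TEXT with a "prefix" provides essentially no efficiency.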
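The back-of-the-envelope estimate above can be reproduced directly in MySQL. This is only a sketch of the same arithmetic; the column aliases are made up for illustration, and the 317M row count and 40% overhead are the approximations used in the answer, not measured values.

```sql
SELECT
  -- 1 length byte + prefix bytes for each of the three TEXT prefixes,
  -- plus 5 bytes assumed for the DATETIME column
  (1 + 5) + (1 + 10) + (1 + 30) + 5                           AS bytes_per_entry,
  -- multiply by the estimated 317M rows, expressed in GB (10^9 bytes)
  ROUND(((1 + 5) + (1 + 10) + (1 + 30) + 5) * 317e6 / 1e9, 1)       AS raw_gb,
  -- add the assumed ~40% overhead
  ROUND(((1 + 5) + (1 + 10) + (1 + 30) + 5) * 317e6 * 1.4 / 1e9, 1) AS with_overhead_gb;
```

Note that the prefix lengths are maximums. Short stored values such as "235" occupy fewer bytes than the full prefix, which may explain part of the gap between this estimate and the figure from information_schema.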

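Indexing numbers is also cheaper than indexing strings. Your id_unita(10) takes up to 11 bytes in each row of the index; a MEDIUMINT UNSIGNED takes a fixed 3 bytes. That is, the index would be smaller and more useful.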
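A possible way to apply this advice is sketched below. It assumes that all three ID columns contain only digits and that every value fits in SMALLINT UNSIGNED (0 to 65535); neither assumption is confirmed by the question, so the first statement checks for non-numeric values before anything is changed. The index name `idx_sensor_time` is hypothetical. On a table of this size the ALTER is likely to run for a long time, so it is worth trying on a copy first.

```sql
-- Step 1: count rows that would NOT convert cleanly to an integer.
-- The result should be 0 before proceeding.
SELECT COUNT(*) AS non_numeric_rows
FROM as_rilevazioni
WHERE id_sistema_di_monitoraggio NOT REGEXP '^[0-9]+$'
   OR id_unita                   NOT REGEXP '^[0-9]+$'
   OR id_sensore                 NOT REGEXP '^[0-9]+$';

-- Step 2: replace the prefix-based unique key with a full-column index
-- on compact integer columns, keeping `data` as the last key part so the
-- date range in the WHERE clause can use the index.
ALTER TABLE as_rilevazioni
  DROP INDEX `UNIQUE`,
  MODIFY id_sistema_di_monitoraggio SMALLINT UNSIGNED NOT NULL,
  MODIFY id_unita                   SMALLINT UNSIGNED NOT NULL,
  MODIFY id_sensore                 SMALLINT UNSIGNED NOT NULL,
  -- A plain (non-unique) index is used here as the cautious choice;
  -- use UNIQUE KEY only if a sensor can never report twice in one second.
  ADD KEY idx_sensor_time
      (id_sistema_di_monitoraggio, id_unita, id_sensore, `data`);
```

After the change, rerunning the EXPLAIN from the question would show whether the new index is chosen and what key_len it reports.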