Question

We have a table in MySQL; its schema is below.

CREATE TABLE campaigns (
  domain varchar(50) ,
  campaign_id bigint(12) ,
  log_time datetime ,
  log_type int,
  node_id bigint(12) 
)

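Brief information about the table:

A domain can have multiple campaigns, and a campaign can have multiple nodes.

The table has 150 million rows. There are 40k unique domains.

I want to create a composite index on this table to serve reports at the campaign level and at the node level.

Suppose I create a composite index like this: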

KEY campid_domain_nodeid_logtime (`campaign_id`,`domain`,`node_id`,`log_time`)

Would it fully satisfy the queries below, that is, at both the campaign level and the node level?

Campaign-level report

select count(*) from campaigns 
where domain = 'aaa' and campaign_id = '1235' 
and log_time between '2016-01-01 00:00:00' and '2016-02-02 00:00:00'

Node-level report

select count(*) from campaigns
where domain = 'aaa' and campaign_id = '1235' and node_id = '2345'
and log_time between '2016-01-01 00:00:00' and '2016-02-02 00:00:00'


Answer 1

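You can think of an index as an ordered list with fast lookup. If you have a composite index on fields A, B, C, D, the list is sorted by A first; rows with the same A are then sorted by B, then by C, then by D.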

A1 | B1 | C1 | D1 | -> pointer to row
A1 | B1 | C1 | D2 | -> pointer to row
A1 | B1 | C2 | D1 | -> pointer to row
A1 | B1 | C2 | D2 | -> pointer to row
A1 | B2 | C1 | D1 | -> pointer to row
...
A2 | B1 | C1 | D1 | -> pointer to row
A2 | B1 | C1 | D2 | -> pointer to row

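The query optimizer inspects your query. If your query filters on A, B, C and D, everything is fine. For a good database the order of the conditions in the query does not matter, so you could just as well write where D and C and B and A.

If your query only filters on A, everything is fine too, because all rows with the same A sit one after another in the list.

If your query only filters on D, the index is useless. Rows with the same D but different A are scattered across the whole list.

If your query filters on A, B and D, like your campaign-level report, the index helps partially. It can be used to speed up the lookup on A and B, but it then has to iterate over every entry with that A and B, because C is missing.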
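To map this onto the question, here is a sketch that assumes the index from the question, (campaign_id, domain, node_id, log_time), plays the role of (A, B, C, D). The comments describe what the ordering argument above implies, not measured behavior.

```sql
-- Assumed index from the question: (campaign_id, domain, node_id, log_time),
-- i.e. A = campaign_id, B = domain, C = node_id, D = log_time.

-- Node-level report: equality on A, B, C plus a range on D.
-- Every index column takes part in the lookup.
SELECT COUNT(*) FROM campaigns
WHERE campaign_id = 1235 AND domain = 'aaa' AND node_id = 2345
  AND log_time BETWEEN '2016-01-01 00:00:00' AND '2016-02-02 00:00:00';

-- Campaign-level report: equality on A and B, range on D, nothing on C.
-- The lookup narrows to the entries with this A and B; the log_time
-- condition is then checked entry by entry within that group.
SELECT COUNT(*) FROM campaigns
WHERE campaign_id = 1235 AND domain = 'aaa'
  AND log_time BETWEEN '2016-01-01 00:00:00' AND '2016-02-02 00:00:00';
```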

You can define more than one index. The drawback, of course, is that every additional index makes writes slower and takes up extra disk space.
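As a sketch of that trade-off for this table: one could keep the index from the question for the node-level report and add a second one aimed at the campaign-level report. The index name idx_campid_domain_logtime is made up for illustration, and whether the extra write cost is acceptable on a 150-million-row table is something to measure rather than assume.

```sql
-- Hypothetical second index for the campaign-level report:
-- the two equality columns first, the range column (log_time) last.
ALTER TABLE campaigns
  ADD INDEX idx_campid_domain_logtime (campaign_id, domain, log_time);
```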

Answer 2

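No, the composite index will not help either of the two queries in the form they are listed. The fields in the where condition need to be in the same order as the fields in the index.

I would also change the order of the fields in the index by moving log_time to the 3rd position: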

KEY campid_domain_nodeid_logtime (`campaign_id`,`domain`,`log_time`,`node_id`)

The first query, with the order of campaign_id and domain swapped:

select count(*) from campaigns 
where campaign_id = '1235' and domain = 'aaa'
and log_time between '2016-01-01 00:00:00' and '2016-02-02 00:00:00'

The second query, with the order of campaign_id and domain swapped, and likewise node_id and log_time:

select count(*) from campaigns
where  campaign_id = '1235' and domain = 'aaa'
   and  log_time between '2016-01-01 00:00:00' and '2016-02-02 00:00:00' 
   and node_id = '2345'

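You can run explain to verify that the index is used. If you have any node-related queries that do not filter on log_time, those queries will only be able to use the campaign_id and domain part of the index.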
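A minimal sketch of that check, using the campaign-level query. In the EXPLAIN output, the key column names the index MySQL picked, key_len shows how many bytes of the index key were used, and rows is the optimizer's estimate of how many entries it will examine. Running it on both orderings of the where conditions is also a quick way to see whether the ordering changes the plan.

```sql
-- Prefix the query with EXPLAIN; nothing is executed against the data,
-- MySQL only reports the plan it would use.
EXPLAIN
SELECT COUNT(*) FROM campaigns
WHERE campaign_id = 1235 AND domain = 'aaa'
  AND log_time BETWEEN '2016-01-01 00:00:00' AND '2016-02-02 00:00:00';
```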

Answer 3

https://dev.mysql.com/doc/refman/5.6/en/multiple-column-indexes.html

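The link above explains how the order of columns works in a multi-column index.

Create the index with the columns in the following order:

domain, campaign_id, node_id, log_time

and change the node-level report to: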

select count(*) from campaigns
where domain = 'aaa' and campaign_id = '1235'  
and  log_time between '2016-01-01 00:00:00' and '2016-02-02 00:00:00'
and node_id = '2345'
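The suggested column order could be created roughly as follows. This is a sketch only; the index name is hypothetical.

```sql
-- Hypothetical index name; column order as suggested in this answer.
CREATE INDEX idx_domain_campid_nodeid_logtime
  ON campaigns (domain, campaign_id, node_id, log_time);
```

With this order the node-level report has equality conditions on the first three columns and a range on the last one, while the campaign-level report can still use the leading (domain, campaign_id) part.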

How MySQL multi-column indexes work
