Question

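What is the right index for this query?

I have tried different index combinations for this query, but it still reports Using temporary, Using filesort, and so on.

Total rows in the table: 760,346

Rows with product = 'dresses': 122,554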

CREATE TABLE IF NOT EXISTS `product_data` (
  `table_id` int(11) NOT NULL AUTO_INCREMENT,
  `id` int(11) NOT NULL,
  `price` int(11) NOT NULL,
  `store` varchar(255) NOT NULL,
  `brand` varchar(255) DEFAULT NULL,
  `product` varchar(255) NOT NULL,
  `model` varchar(255) NOT NULL,
  `size` varchar(50) NOT NULL,
  `discount` varchar(255) NOT NULL,
  `gender_id` int(11) NOT NULL,
  `availability` int(11) NOT NULL,
  PRIMARY KEY (`table_id`),
  UNIQUE KEY `table_id` (`table_id`),
  KEY `id` (`id`),
  KEY `discount` (`discount`),
  KEY `step_one` (`product`,`availability`),
  KEY `step_two` (`product`,`availability`,`brand`,`store`),
  KEY `step_three` (`product`,`availability`,`brand`,`store`,`id`),
  KEY `step_four` (`brand`,`store`),
  KEY `step_five` (`brand`,`store`,`id`)
) ENGINE=InnoDB ;

Query:

SELECT id ,store,brand FROM `product_data` WHERE product='dresses' and 
availability='1' group by brand,store order by store limit 10;

Execution time: (10 total, query took 1.0941 sec)

EXPLAIN plan:

possible_keys: step_one, step_two, step_three, step_four, step_five

key: step_two

ref: const, const

rows: 229438

Extra: Using where; Using temporary; Using filesort

I have tried these indexes:

Key step_one (product, availability)

Key step_two (product, availability, brand, store)

Key step_three (product, availability, brand, store, id)

Key step_four (brand, store)

Key step_five (brand, store, id)

Answer 1

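Of the indexes you have, step_two is the best one. The product column has more distinct values than the availability column.

A few notes on the query:

availability='1' should be availability = 1, which avoids an unnecessary implicit conversion between int and varchar.
"group by brand" should not be used: GROUP BY should only be used when aggregate functions are among the selected columns. What are you trying to achieve with the grouping?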

Answer 2

Without an aggregate function, your GROUP BY clause really doesn't make sense.

If you can rewrite the query as

SELECT id ,store FROM `product_data` WHERE product='dresses' and availability='1' order by store limit 10;

then an index on (product, availability, store) will remove all the filesorting.
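A minimal sketch of that index, for the version of the query without GROUP BY. The index name `idx_product_avail_store` is made up for illustration, and the plan actually chosen depends on the MySQL version and the data distribution, so the EXPLAIN is only a way to check whether Using filesort has disappeared from Extra.

```sql
-- Hypothetical index name. Column order: the two equality filters first,
-- then the ORDER BY column, so rows can be read already sorted by store.
ALTER TABLE `product_data`
  ADD INDEX `idx_product_avail_store` (`product`, `availability`, `store`);

-- Check the plan; availability is compared with an integer literal.
EXPLAIN
SELECT id, store
FROM `product_data`
WHERE product = 'dresses'
  AND availability = 1
ORDER BY store
LIMIT 10;
```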

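See the SQLFiddle: http://sqlfiddle.com/#!9/60f33d/2

Update:

The SQLFiddle makes your intention clear: you are using GROUP BY to simulate DISTINCT. If that is the case, I don't think you can remove the filesort and temporary table steps from the query, but I also don't think those steps should be very expensive.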
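If the GROUP BY is only there to collapse duplicate (brand, store) pairs, the same intent can be written directly with DISTINCT. This is a sketch under the assumption that id is not actually needed in the output, since there is no single well-defined id for a deduplicated pair.

```sql
-- One row per distinct (store, brand) pair among available dresses.
-- id is left out on purpose: it has no single value once duplicates collapse.
SELECT DISTINCT store, brand
FROM `product_data`
WHERE product = 'dresses'
  AND availability = 1
ORDER BY store
LIMIT 10;
```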

Answer 3

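The real problem is not the index, but the mismatch between the GROUP BY and the ORDER BY, which prevents taking advantage of the LIMIT.

This

INDEX(product, availability, store, brand, id)

would be "covering" and in the right order. But note that I have swapped store and brand...

Change the query to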

SELECT  id ,store,brand
    FROM  `product_data`
    WHERE  product='dresses'
      and  availability='1'
    GROUP BY store, brand    -- change
    ORDER BY store, brand    -- change
    limit  10;

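This changes the GROUP BY to start with store, to match the ORDER BY ordering, which avoids an extra sort. And it changes the ORDER BY to be identical to the GROUP BY so that the two can be combined.

Given these changes, the INDEX can now be used all the way through to the LIMIT, allowing the processing to look at only 10 rows rather than a much larger set.

Anything less than all of these changes will not be as efficient.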
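Put together, the index and the rewritten query could be tried like this. The index name `idx_covering` is an assumption, not something from the question. Note also that on servers where ONLY_FULL_GROUP_BY is enabled (the default sql_mode since MySQL 5.7.5), selecting id without grouping or aggregating it is rejected, so this sketch assumes that mode is off.

```sql
-- Hypothetical index name. Equality columns first, then the
-- GROUP BY / ORDER BY columns, then id so the index is covering.
ALTER TABLE `product_data`
  ADD INDEX `idx_covering` (`product`, `availability`, `store`, `brand`, `id`);

-- Inspect Extra for the absence of "Using temporary" and "Using filesort".
-- Assumes ONLY_FULL_GROUP_BY is disabled, because id is not grouped.
EXPLAIN
SELECT id, store, brand
FROM `product_data`
WHERE product = 'dresses'
  AND availability = 1
GROUP BY store, brand
ORDER BY store, brand
LIMIT 10;
```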

Further discussion:

INDEX(product, availability,   -- these two can be in either order
      store, brand,      -- must match both `GROUP BY` and `ORDER BY`
      id)   -- tacked on (on the end) to make it "covering"

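"Covering" means that all the columns in the SELECT are found in the INDEX, so there is no need to reach into the data.

But... the whole query does not make sense because id is included in the SELECT. If you want to find which stores have dresses available, get rid of id. If you want to list all the available dresses, change id to GROUP_CONCAT(id).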
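The two alternatives can be sketched as follows. The alias `ids` is made up for illustration. GROUP_CONCAT output is truncated at `group_concat_max_len` (1024 by default), so a pair with many ids may need that setting raised.

```sql
-- Variant 1: which (store, brand) pairs have dresses available; id dropped.
SELECT store, brand
FROM `product_data`
WHERE product = 'dresses'
  AND availability = 1
GROUP BY store, brand
ORDER BY store, brand
LIMIT 10;

-- Variant 2: list every id for each (store, brand) pair as one
-- comma-separated string (subject to group_concat_max_len).
SELECT GROUP_CONCAT(id) AS ids, store, brand
FROM `product_data`
WHERE product = 'dresses'
  AND availability = 1
GROUP BY store, brand
ORDER BY store, brand
LIMIT 10;
```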

How to optimize in MySQL when Extra shows: Using where; Using temporary; Using filesort
