Question

I have 2 MySQL (Ver 14.14 Distrib 5.5.49) tables that look like this:

CREATE TABLE `Document` (
    `Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `CompanyCode` int(10) unsigned NOT NULL,
    `B` int(10) unsigned NOT NULL,
    `C` int(10) unsigned NOT NULL,
    `DocumentCode` int(10) unsigned NOT NULL,
    `E` int(11) DEFAULT '0',
    `EpochSeconds` int(11) DEFAULT '0',
    `G` int(10) unsigned NOT NULL,
    `H` int(10) unsigned NOT NULL,
    `I` int(11) DEFAULT '0',
    `J` int(11) DEFAULT '0',
    `K` varchar(48) DEFAULT '',
  PRIMARY KEY (`Id`),
    KEY `Idx1` (`CompanyCode`),
    KEY `Idx2` (`B`,`C`),
    KEY `Idx3` (`CompanyCode`,`DocumentCode`),
    KEY `Idx4` (`CompanyCode`,`B`,`C`),
    KEY `Idx5` (`H`),
    KEY `Idx6` (`CompanyCode`,`K`),
    KEY `Idx7` (`K`),
    KEY `Idx8` (`K`,`E`),
    KEY `NEWIDX` (`DocumentCode`,`EpochSeconds`)
) ENGINE=MyISAM AUTO_INCREMENT=397783215 DEFAULT CHARSET=latin1

CREATE TABLE `Company` (
    `Id` int(10) unsigned NOT NULL AUTO_INCREMENT,
    `CompanyCode` int(10) unsigned NOT NULL,
    `CompanyName` varchar(150) NOT NULL,
    `C` varchar(2) NOT NULL,
    `D` varchar(10) NOT NULL,
    `E` varchar(150) NOT NULL,
  PRIMARY KEY (`Id`),
    KEY `Idx1` (`CompanyCode`),
    KEY `Idx2` (`CompanyName`),
    KEY `Idx3` (`C`),
    KEY `Idx4` (`D`,`C`),
    KEY `Idx5` (`E`)
) ENGINE=MyISAM AUTO_INCREMENT=9218804 DEFAULT CHARSET=latin1

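I have omitted most of the column definitions from Company because I don't want to complicate this question unnecessarily, but none of the missing columns are involved in any KEY definitions.

Document has about 12.5 million rows and Company has about 600,000 rows. I added KEY NEWIDX to Document to facilitate the following query:

SELECT Document.*, Company.CompanyName
FROM Document, Company
WHERE Document.DocumentCode = ?
  AND Document.CompanyCode = Company.CompanyCode
ORDER BY Document.EpochSeconds desc LIMIT 0, 30;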

Execution plan:

+----+-------------+----------+------+----------------------------+------+---------+------------------------------+--------+---------------------------------+
| id | select_type | table    | type | possible_keys              | key  | key_len | ref                          | rows   | Extra                           |
+----+-------------+----------+------+----------------------------+------+---------+------------------------------+--------+---------------------------------+
|  1 | SIMPLE      | Company  | ALL  | Idx1                       | NULL | NULL    | NULL                         | 593729 | Using temporary; Using filesort |
|  1 | SIMPLE      | Document | ref  | Idx1,Idx4,Idx6,NEWIDX,Idx3 | Idx3 | 8       | db.Company.CompanyCode,const |      3 |                                 |
+----+-------------+----------+------+----------------------------+------+---------+------------------------------+--------+---------------------------------+

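If the value of Document.DocumentCode above is anything other than 8, the query returns instantly (0.00 seconds). If the value is 8, the query takes anywhere between 38 and 45 seconds. If I remove Company from the query, e.g.

SELECT * FROM Document WHERE DocumentCode = 8 ORDER BY EpochSeconds desc LIMIT 0, 30;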

Execution plan:

+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+
| id | select_type | table     | type | possible_keys | key        | key_len | ref   | rows    | Extra       |
+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+
|  1 | SIMPLE      | Documents | ref  | NEWIDX        | NEWIDX     | 4       | const | 3654177 | Using where |
+----+-------------+-----------+------+---------------+------------+---------+-------+---------+-------------+

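...then the query returns instantly (0.00 seconds).

The range of possible values for Document.DocumentCode is 369, with a reasonable distribution across those values.

There are ~3.15 million rows in Document with DocumentCode = 8.

Also, consider that there are about 1.5 million rows in Document with DocumentCode = 9, and that query returns instantly.

I also ran the mysqlcheck utility on the Document table, but it did not report any problems.

Why would the query for DocumentCode = 8 take so long when the join to Company is part of the query, while any other value of DocumentCode returns so quickly?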

Here is a comparison of the execution plans, first for DocumentCode = 8:

+----+-------------+----------+------+----------------------------+------+---------+------------------------------+--------+---------------------------------+
| id | select_type | table    | type | possible_keys              | key  | key_len | ref                          | rows   | Extra                           |
+----+-------------+----------+------+----------------------------+------+---------+------------------------------+--------+---------------------------------+
|  1 | SIMPLE      | Company  | ALL  | Idx1                       | NULL | NULL    | NULL                         | 593729 | Using temporary; Using filesort |
|  1 | SIMPLE      | Document | ref  | Idx1,Idx4,Idx6,NEWIDX,Idx3 | Idx3 | 8       | db.Company.CompanyCode,const |      3 |                                 |
+----+-------------+----------+------+----------------------------+------+---------+------------------------------+--------+---------------------------------+

And for DocumentCode = 9:

+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+
| id | select_type | table    | type | possible_keys              | key    | key_len | ref                      | rows    | Extra       |
+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+
|  1 | SIMPLE      | Document | ref  | Idx1,Idx4,Idx6,NEWIDX,Idx3 | NEWIDX | 4       | const                    | 1953090 | Using where |
|  1 | SIMPLE      | Company  | ref  | Idx1                       | Idx1   | 4       | db.Document.CompanyCode  |       1 |             |
+----+-------------+----------+------+----------------------------+--------+---------+--------------------------+---------+-------------+

They are clearly different, but I don't understand them well enough to explain what is going on. Also, running ANALYZE TABLE Document and ANALYZE TABLE Company reports OK.

Answer 1

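The reason for this behaviour lies in the way MySQL optimizes the query, or at least tries to. You can see it in the EXPLAIN output: MySQL changes which table it uses as the base of the query. With DocumentCode = 8 it starts from Company; with DocumentCode = 9 it starts from Document. For DocumentCode = 8, MySQL thinks it will be faster not to use the index and to use the other table as the base instead. Why, I don't know.

I would suggest using an explicit join so that the intended table order is clear (note that for an inner join the optimizer may still reorder the tables):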

SELECT Document.*, Company.CompanyName 
FROM Document 
JOIN Company ON Document.CompanyCode = Company.CompanyCode 
WHERE Document.DocumentCode = ?
ORDER BY Document.EpochSeconds desc LIMIT 0, 30;

MySQL even lets you tell it which index it should use:

SELECT Document.*, Company.CompanyName 
FROM Document 
JOIN Company USE INDEX (Idx1) ON Document.CompanyCode = Company.CompanyCode 
WHERE Document.DocumentCode = ?
ORDER BY Document.EpochSeconds desc LIMIT 0, 30;

You could also try FORCE INDEX instead of USE INDEX, which is stronger. But I would guess it uses Idx1 by default anyway.
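As a rough sketch of that FORCE INDEX variant: it is the same query with only the hint keyword changed, and the literal 8 is substituted for the placeholder purely for illustration. Whether it actually changes the plan on this data is something to confirm with EXPLAIN.

```sql
-- Same join as above, but with FORCE INDEX instead of USE INDEX.
-- FORCE INDEX makes a table scan of Company look very expensive to the
-- optimizer, so Idx1 is used whenever it can serve the lookup.
SELECT Document.*, Company.CompanyName
FROM Document
JOIN Company FORCE INDEX (Idx1)
  ON Document.CompanyCode = Company.CompanyCode
WHERE Document.DocumentCode = 8   -- illustrative value; the question uses ?
ORDER BY Document.EpochSeconds DESC
LIMIT 0, 30;
```

Note that an index hint on Company does not by itself fix the join order; it only influences how Company is accessed.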

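Note, however, that your new index NEWIDX will not be used for this query, because the join has to be performed first and the joined result set, which has no index, is then filtered. That makes the ORDER BY on the result a very expensive operation.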
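One possible way to avoid sorting the joined result, not taken from this answer and offered only as a hedged sketch, is to pick the 30 newest Document rows first in a derived table and join Company afterwards. The aliases d and c and the literal 8 are illustrative. The sketch assumes every Document row has exactly one matching Company row; if that does not hold, it can return a different set of rows than the original query.

```sql
-- Step 1 (inner query): the single-table query from the question, which
-- was reported to return instantly using NEWIDX.
-- Step 2 (outer query): look up the company name for those 30 rows only.
SELECT d.*, c.CompanyName
FROM (
    SELECT *
    FROM Document
    WHERE DocumentCode = 8          -- illustrative value
    ORDER BY EpochSeconds DESC
    LIMIT 0, 30
) AS d
JOIN Company AS c
  ON c.CompanyCode = d.CompanyCode
ORDER BY d.EpochSeconds DESC;       -- re-state the order for the final result
```

The outer ORDER BY is repeated because the order of rows coming out of a join is not guaranteed, and sorting 30 rows is cheap.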

Answer 2

Use STRAIGHT_JOIN to force the order in which MySQL performs the join:

SELECT Document.*, 
Company.CompanyName 
FROM Document
STRAIGHT_JOIN Company 
ON Document.CompanyCode = Company.CompanyCode
WHERE Document.DocumentCode = ? 
ORDER BY Document.EpochSeconds DESC
LIMIT 0, 30;
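To check whether the hint has the intended effect, one could prefix the statement with EXPLAIN and compare the result with the plans in the question. This is a minimal sketch with the literal 8 substituted for the placeholder; the output is not shown here because it depends on the actual data and statistics.

```sql
-- With STRAIGHT_JOIN the left table (Document) is read before the right
-- table (Company), so Document should appear as the first row of the plan.
-- Which index is chosen for Document is still up to the optimizer.
EXPLAIN
SELECT Document.*,
       Company.CompanyName
FROM Document
STRAIGHT_JOIN Company
  ON Document.CompanyCode = Company.CompanyCode
WHERE Document.DocumentCode = 8     -- illustrative value
ORDER BY Document.EpochSeconds DESC
LIMIT 0, 30;
```

If the plan resembles the DocumentCode = 9 plan from the question (Document first via NEWIDX, then Company via Idx1), the join order is the one that was fast there.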

MySQL indexed query takes a long time for a specific column value
