Question

I am running a MySQL join query that never finishes:

SELECT t1.`id` FROM `person` as t1 
      JOIN `temp_table` as t2 
      on t1.`date` = t2.`date` 
      and t1.`name` = t2.`name` 
      and t1.`country_id`= t2.`country_id`

The person table and temp_table have exactly the same columns.

When I run the query with EXPLAIN, I see the following result:

id  select_type  table  type   possible_keys  key   key_len  ref                                                                 rows   Extra
1   SIMPLE       t1     index  test           test  777      NULL                                                                99560  Using where; Using index
1   SIMPLE       t2     ref    test           test  777      development.t1.date,development.t1.name,development.t1.country_id  1      Using index

I created an index on both tables with the following statements:

ALTER TABLE `person` ADD INDEX `test` (`date`,`name`,`country_id`);
ALTER TABLE `temp_table` ADD INDEX `test` (`date`,`name`,`country_id`);

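Each table contains the same 100,000 rows, so the join should return 100,000 rows. I assume the query is this slow because of the number of rows scanned on the t1 table, but I'm not sure why that would happen when I have applied the index. Any help would be greatly appreciated.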

Answer 1

Having the same columns does not guarantee a 1-1 match unless the combination of columns is unique.

Try running this query:

select cnt, count(*)
from (select date,name, country_id, count(*) as cnt
      from person
      group by date,name, country_id
     ) t
group by cnt;

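This gives the number of times each combination occurs. If you get only one row, with "1" in the cnt column, then your query should be fine. If you get other values, then the join is multiplying rows, and that is what causes your performance problem.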
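
To put a number on that multiplication, here is a minimal sketch. It assumes, as the question states, that temp_table holds the same rows as person, so a combination that occurs cnt times in each table contributes cnt * cnt rows to the join. The alias expected_join_rows is just an illustrative name.

-- Estimate the size of the join result without running the join.
-- Each (date, name, country_id) group of size cnt matches itself
-- cnt times per row, giving cnt * cnt joined rows.
SELECT SUM(cnt * cnt) AS expected_join_rows
FROM (SELECT COUNT(*) AS cnt
      FROM `person`
      GROUP BY `date`, `name`, `country_id`
     ) AS t;

If this returns roughly the table's row count, the combinations are close to unique. A much larger number means the join is producing far more rows than either table holds.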

Edit:

Your output appears to be:

2564    37
2565    1
2566    1

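This means that 37 combinations of the three columns each appear 2,564 times. Those alone produce 2,564 * 2,564 * 37 rows (243,241,552 rows) in the result set. That is a lot of rows, and it probably explains why your query is slow.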
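
If what you actually need is each person id that has at least one matching row in temp_table (an assumption about your intent, since the question does not say), one option is a semi-join with EXISTS, which returns every person row at most once instead of once per match. A rough sketch:

-- Return each person row at most once, however many temp_table
-- rows share its (date, name, country_id) combination.
SELECT t1.`id`
FROM `person` AS t1
WHERE EXISTS (SELECT 1
              FROM `temp_table` AS t2
              WHERE t2.`date` = t1.`date`
                AND t2.`name` = t1.`name`
                AND t2.`country_id` = t1.`country_id`);

The subquery can use the existing `test` index on temp_table for the lookup, so this should stay close to one index probe per person row.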

Answer 2

A join multiplies the number of tuples. Try using a natural join or a GROUP BY instead.

SELECT t1.`id` FROM `person` as t1 
 NATURAL JOIN `temp_table` as t2

I don't know MySQL, but this should work in psql, which should be similar.
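
One caveat worth checking before relying on this: MySQL defines a NATURAL JOIN as an inner join with a USING clause that names every column the two tables have in common. Since the question says both tables have exactly the same columns, that would include id as well. A sketch of the explicit equivalent, assuming the only columns are id, date, name and country_id (the full column list is not given in the question):

-- Explicit form of the NATURAL JOIN above, under the assumed column list.
-- Rows now match only when id is equal too, not just the three indexed columns.
SELECT t1.`id`
FROM `person` AS t1
JOIN `temp_table` AS t2
  USING (`id`, `date`, `name`, `country_id`);

If id is unique in both tables, this avoids the row multiplication, but it answers a slightly different question than the original three-column join.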

MySQL Join Query Is Very Slow
