Trying to optimize a query that is crippling performance

Time: 2016-01-12 11:23:05

Tags: mysql sql query-optimization

I'm having terrible problems with the query below: it cripples the performance of a script and often takes 10-30 seconds to complete. I'm wondering if anyone has any optimization advice. Specific or general is fine, I'm no querysmith.

Tinkering with types and indexes as well as the query itself is certainly doable.

SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1
LEFT JOIN TABLE_1 AS t2
    ON t1.column_1 = t2.column_1
    AND t1.column_3 = t2.column_3
    AND t2.int_value = 1
    AND t2.column_4 = 'test_string_1'
WHERE t1.column_5 = 'text_string_2';

Size of TABLE_1 ~ 6 million rows

Structure of TABLE_1:

+--------------+--------------+------+-----+-------------------+-----------------------------+
| Field        | Type         | Null | Key | Default           | Extra                       |
+--------------+--------------+------+-----+-------------------+-----------------------------+
| id           | int(11)      | NO   | PRI | NULL              | auto_increment              |
| column_1     | bigint(12)   | YES  | MUL | NULL              |                             |
| column_4     | varchar(100) | YES  | MUL | NULL              |                             |
| column_5     | varchar(140) | YES  |     | NULL              |                             |
| column_2     | varchar(15)  | YES  | MUL | NULL              |                             |
| int_value    | int(1)       | YES  | MUL | NULL              |                             |
| last_updated | timestamp    | NO   | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+--------------+--------------+------+-----+-------------------+-----------------------------+

2 answers:

Answer 0: (score: 1)

For this query, you want the right indexes:

SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1 LEFT JOIN
     TABLE_1 AS t2
     ON t1.column_1 = t2.column_1 AND
        t1.column_3 = t2.column_3 AND
        t2.int_value = 1 AND
        t2.column_4 = 'test_string_1'
WHERE t1.column_5 = 'text_string_2';

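As written, those would be TABLE_1(column_5, column_1, column_3, column_2) and TABLE_1(column_1, column_3, int_value, column_4). Both indexes go on TABLE_1 because the query is a self-join: the first serves the t1 side (the WHERE filter plus the selected columns), the second serves the t2 lookup.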
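As a rough sketch, the two indexes could be created like this. The index names are made up for illustration, and column_3 is assumed to exist even though it does not appear in the table structure posted in the question.

```sql
-- Hypothetical index names; column_3 is assumed to exist
-- (it is missing from the structure shown in the question).

-- t1 side: filter on column_5, then the join and select columns.
CREATE INDEX idx_t1_filter
    ON TABLE_1 (column_5, column_1, column_3, column_2);

-- t2 side: the columns compared in the ON clause.
CREATE INDEX idx_t2_join
    ON TABLE_1 (column_1, column_3, int_value, column_4);
```

Running EXPLAIN on the query afterwards should show whether the optimizer actually picks these indexes.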

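However, I think the query can be greatly simplified. A LEFT JOIN keeps all rows from the first table, whether or not the join condition matches. The WHERE condition is only on the first table and the selected columns come only from the first table, so the join can at most duplicate t1 rows, and DISTINCT removes those duplicates again. The query should therefore be equivalent to: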

SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1 
WHERE t1.column_5 = 'text_string_2';

The DISTINCT may not be necessary. Either way, the best index for this simplified version is TABLE_1(column_5, column_1, column_2).
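A minimal sketch of how that index might be created and checked. The index name is hypothetical.

```sql
-- Hypothetical index name.
CREATE INDEX idx_t1_c5_c1_c2
    ON TABLE_1 (column_5, column_1, column_2);

-- Inspect the plan for the simplified query.
EXPLAIN
SELECT DISTINCT t1.column_1, t1.column_2
FROM TABLE_1 AS t1
WHERE t1.column_5 = 'text_string_2';
```

If the optimizer uses the index as a covering index, the Extra column of the EXPLAIN output will typically include "Using index", meaning the query is answered from the index without reading the table rows.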

Note: If you made a mistake when writing the query in the question, please ask another question rather than invalidating this answer.

Answer 1: (score: 0)

Get rid of the DISTINCT and try a HAVING clause. Maybe this will be faster:

SELECT t1.column_1, t1.column_2
FROM TABLE_1 AS t1
LEFT JOIN TABLE_1 AS t2
    ON t1.column_1 = t2.column_1
    AND t1.column_3 = t2.column_3
HAVING t1.column_5 = 'text_string_2'
    AND t2.column_4 = 'test_string_1'
    AND t2.int_value = 1;