Slow MySQL query on a one-to-many join table for max date more than 6 months ago

Date: 2013-09-29 22:24:23

Tags: mysql performance date join max

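I am updating an existing system and need to stick with some of the code that is already in use. Each main_id may have zero or many records in table n. main has about 40,000 records; n has about 330 records.

I only need to select the records in main that have no n.date within the past 6 months.

Unfortunately, every approach I have tried is slow.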

main.main_id
main.field1
main.field2
main.field3

n.n_id
n.main_id
n.date
n.field1
n.field2
n.field3

The query has the form:

SELECT distinct(main.main_id) FROM main LEFT JOIN...

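I have tried putting subqueries in various places, as well as views, temporary tables, and adding indexes, and so far nothing has brought it anywhere near a reasonable speed.

Unfortunately I cannot list everything I have tried so far: I kept hoping I would get it working, so I did not take notes, and it is now very late!

I suspect that running the query directly from the n table might be faster, but that would require a lot of rewriting. The query has many other elements, yet with the other joined tables but without this one it completes in under two seconds.

This is the simplest version; usually there are more WHERE clauses and JOINs.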

EXPLAIN 
SELECT distinct(`main`.`main_id`),`morefields`,`morefields2`
FROM main LEFT JOIN anothertable ON anothertable.a_n = main.a 
LEFT JOIN anothertable2 ON anothertable2.g_n = main.CG 
LEFT JOIN anothertable3 ON anothertable3.t_n = main.t 
LEFT JOIN (SELECT max(DateTS) as note_date, main_id FROM n GROUP BY main_id) n_sub ON main.main_id=n_sub.main_id
WHERE main.deleted = '0' 
AND n_sub.note_date < DATE_SUB(now(), INTERVAL 6 MONTH)
ORDER BY main.morefields ASC LIMIT 0, 30;
+----+-------------+---------------+--------+---------------+---------+---------+---------------+--------+----------------------------------------------+
| id | select_type | table         | type   | possible_keys | key     | key_len | ref           | rows   | Extra                                        |
+----+-------------+---------------+--------+---------------+---------+---------+---------------+--------+----------------------------------------------+
|  1 | PRIMARY     | <derived2>    | ALL    | NULL          | NULL    | NULL    | NULL          |  40324 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY     | main          | eq_ref | PRIMARY       | PRIMARY | 4       | n_sub.cust_no |      1 | Using where                                  |
|  1 | PRIMARY     | anothertable  | eq_ref | PRIMARY       | PRIMARY | 4       | db.maij.area  |      1 |                                              |
|  1 | PRIMARY     | anothertable2 | eq_ref | PRIMARY       | PRIMARY | 4       | db.main.CG    |      1 |                                              |
|  1 | PRIMARY     | anothertable3 | eq_ref | PRIMARY       | PRIMARY | 4       | db.main.t     |      1 |                                              |
|  2 | DERIVED     | n             | index  | NULL          | main_id | 4       | NULL          | 285961 |                                              |
+----+-------------+---------------+--------+---------------+---------+---------+---------------+--------+----------------------------------------------+
6 rows in set (30.25 sec)

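2 answers:

Answer 0: (score: 0)

You should be able to get the list of IDs that have no dated record within the past 6 months quite simply: LEFT JOIN to look FOR activity within the past 6 months, then apply a WHERE clause that requires NULL on that join.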

SELECT 
      m.main_id,
      m.a,
      m.CG,
      m.t,
      m.morefields,
      m.morefields2
   from
      main m
         -- match only notes from the past 6 months
         -- (DateTS is the date column used in the question's query)
         left join n
            ON m.main_id = n.main_id
            and n.DateTS > date_sub( now(), interval 6 month )
   where
          m.deleted = '0'
      AND n.main_id is null
   order by 
      m.morefields asc
   limit
      0, 30
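
The same anti-join can also be written with NOT EXISTS, which some people find easier to read. This is only a sketch: it assumes the timestamp column in n is DateTS (as in the question's query) and reuses the placeholder column names from the question. Note that both forms also return main rows that have no records in n at all, which matches "no n.date within the past 6 months" but differs from the question's MAX(DateTS) filter, which drops those rows.

```sql
-- Sketch only: anti-join expressed with NOT EXISTS.
-- Assumes n.DateTS is the note timestamp, as in the question's query.
SELECT
      m.main_id,
      m.morefields,
      m.morefields2
   FROM
      main m
   WHERE
          m.deleted = '0'
      -- keep the row only if no note from the past 6 months exists
      AND NOT EXISTS (
             SELECT 1
                FROM n
               WHERE n.main_id = m.main_id
                 AND n.DateTS > DATE_SUB(NOW(), INTERVAL 6 MONTH)
          )
   ORDER BY
      m.morefields ASC
   LIMIT
      0, 30;
```

Either form should be compared with EXPLAIN on the real data before choosing one.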

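Now, you have other joins, and you probably want those fields too. If so, I would wrap the above and use THAT as the basis for the joins... I use the alias "PQ" to identify the "PreQuery" that the rest of the joins attach to.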

select 
      PQ.*,
      A_T1.SomeField(s),
      A_T2.SomeField(s),
      A_T3.SomeField(s)
   from 
      ( entire first query ) as PQ
         left join anothertable A_T1
            on PQ.a = A_T1.a_n
         left join anothertable2 A_T2
            on PQ.CG = A_T2.g_n
         left join anothertable3 A_T3
            on PQ.t = A_T3.t_n

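Since the inner query already has a limit of 30, we don't need to apply the limit again (unless the "another" tables cause some Cartesian result and produce more than one record per main ID).

Since the actual data/context has obviously been hidden, I am only guessing at the actual columns you are really joining on.

Answer 1: (score: 0)

Based on this: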

+----+-------------+-------------+--------+---------------+---------+---------+--------------+--------+----------------------------------------------+
| id | select_type | table       | type   | possible_keys | key     | key_len | ref          | rows   | Extra                                        |
+----+-------------+-------------+--------+---------------+---------+---------+--------------+--------+----------------------------------------------+
|  1 | PRIMARY     |<derived2>   | ALL    | NULL          | NULL    | NULL    | NULL         |  40324 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY     |main         | eq_ref | PRIMARY       | PRIMARY | 4       | n_sub.cust_no|      1 | Using where                                  |
|  1 | PRIMARY     |anothertable | eq_ref | PRIMARY       | PRIMARY | 4       | db.maij.area |      1 |                                              |
|  1 | PRIMARY     |anothertable2| eq_ref | PRIMARY       | PRIMARY | 4       | db.main.CG   |      1 |                                              |
|  1 | PRIMARY     |anothertable3| eq_ref | PRIMARY       | PRIMARY | 4       | db.main.t    |      1 |                                              |
|  2 | DERIVED     |n            | index  | NULL          | main_id | 4       | NULL         | 285961 |                                              |
+----+-------------+-------------+--------+---------------+---------+---------+--------------+--------+----------------------------------------------+

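...it looks to me like there is a problem with your indexes. For example, the first row shows that it is scanning 40,324 full rows for data. Worse, it looks like no index is being used (the key column), because no candidate index is listed for the query (the possible_keys column).

Try this or something like it (unless I'm mistaken), but make sure you try it on a backup copy of the database before making any changes to the live database: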

ALTER TABLE `main` ADD INDEX ( `main_id` )
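
In the EXPLAIN output above, main already appears to be reached through its PRIMARY key (type eq_ref), so an index on n may be worth testing as well. The following is a sketch under assumptions: the index name idx_n_main_date is hypothetical, and the columns main_id and DateTS are taken from the question's query. As with any schema change, try it on a backup copy first.

```sql
-- Hypothetical index name; columns taken from the question's query.
ALTER TABLE `n` ADD INDEX `idx_n_main_date` (`main_id`, `DateTS`);

-- Re-check the plan for the derived table on its own. With both columns
-- in one index, MySQL may be able to answer the GROUP BY main_id /
-- MAX(DateTS) subquery from the index without reading the table rows.
EXPLAIN
SELECT max(DateTS) as note_date, main_id FROM n GROUP BY main_id;
```

Whether the optimizer actually uses the index depends on the MySQL version and the data, so compare the EXPLAIN output before and after.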

Edit

If that does not help, my next suggestion is to try changing this line:

LEFT JOIN (SELECT max(DateTS) as note_date, main_id FROM n GROUP BY main_id) n_sub ON main.main_id=n_sub.main_id

to something like this:

LEFT JOIN (SELECT max(DateTS) as note_date, main_id FROM n GROUP BY main_id) n_sub ON main.main_id=n_sub.main_id AND n_sub.note_date < DATE_SUB(now(), INTERVAL 6 MONTH)

This should allow you to remove this line entirely:

AND n_sub.note_date < DATE_SUB(now(), INTERVAL 6 MONTH)
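
One caveat worth checking before dropping that WHERE line: in a LEFT JOIN, a condition in the ON clause only decides which n_sub rows match; it does not remove rows from main. The sketch below, which reuses the question's table and column names, shows the ON form and one way to keep the filter.

```sql
-- Condition in ON: every non-deleted main row is returned;
-- note_date is simply NULL where no n_sub row satisfies the condition.
SELECT main.main_id, n_sub.note_date
FROM main
LEFT JOIN (SELECT max(DateTS) as note_date, main_id FROM n GROUP BY main_id) n_sub
       ON main.main_id = n_sub.main_id
      AND n_sub.note_date < DATE_SUB(now(), INTERVAL 6 MONTH)
WHERE main.deleted = '0';

-- To keep the original filtering with the ON form, still test the joined side:
--   WHERE main.deleted = '0' AND n_sub.main_id IS NOT NULL
```

With the extra IS NOT NULL test the result should match the original WHERE version, so any speed difference would have to come from the plan rather than from returning different rows.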

Another possibility is to try this:

SELECT distinct(`main`.`main_id`),`morefields`,`morefields2`
FROM main
-- Maybe change the next line to an INNER JOIN..?
LEFT JOIN n ON main.main_id = n.main_id
LEFT JOIN anothertable ON anothertable.a_n = main.a 
LEFT JOIN anothertable2 ON anothertable2.g_n = main.CG 
LEFT JOIN anothertable3 ON anothertable3.t_n = main.t 
WHERE main.deleted = '0'
GROUP BY main.main_id HAVING MAX(n.DateTS) < DATE_SUB(NOW(), INTERVAL 6 MONTH)
ORDER BY main.morefields ASC LIMIT 0, 30;