Question

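In the software I am developing, a car-sales application, there is an agenda section that holds all of a user's appointments.

This section loads quickly with everyday, normal use of the agenda (thousands of rows), but it starts to become very slow once the agenda table reaches 1 million rows.

Structure:

1) Main table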

CREATE TABLE IF NOT EXISTS `agenda` (
  `id_agenda` int(11) NOT NULL AUTO_INCREMENT,
  `id_user` int(11) NOT NULL DEFAULT '0',
  `id_agency` int(11) NOT NULL DEFAULT '0',
  `id_customer` int(11) DEFAULT NULL,
  `id_car` int(11) DEFAULT NULL,
  `id_owner` int(11) DEFAULT NULL,
  `type` int(11) NOT NULL DEFAULT '8',
  `title` varchar(255) NOT NULL DEFAULT '',
  `text` text NOT NULL,
  `start_day` date NOT NULL DEFAULT '0000-00-00',
  `end_day` date NOT NULL DEFAULT '0000-00-00',
  `start_hour` time NOT NULL DEFAULT '00:00:00',
  `end_hour` time NOT NULL DEFAULT '00:00:00',
  PRIMARY KEY (`id_agenda`),
  KEY `start_day` (`start_day`),
  KEY `id_customer` (`id_customer`),
  KEY `id_car` (`id_car`),
  KEY `id_user` (`id_user`),
  KEY `id_owner` (`id_owner`),
  KEY `type` (`type`),
  KEY `id_agency` (`id_agency`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1 ;

2) Auxiliary table

CREATE TABLE IF NOT EXISTS `agenda_cars` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `id_agenda` int(11) NOT NULL,
  `id_car` int(11) NOT NULL,
  `id_owner` int(11) NOT NULL,
  PRIMARY KEY (`id`),
  KEY `id_agenda` (`id_agenda`),
  KEY `id_car` (`id_car`),
  KEY `id_owner` (`id_owner`)
) ENGINE=InnoDB  DEFAULT CHARSET=latin1

Query:

SELECT a.id_agenda
FROM agenda as a
LEFT JOIN agenda_cars as agc on agc.id_agenda = a.id_agenda
WHERE 
(a.id_customer = '22'  OR (a.id_owner = '22' OR agc.id_owner = '22' ))
GROUP BY a.id_agenda
ORDER BY a.start_day,  a.start_hour

EXPLAIN:

id  select_type table   type    possible_keys  key         key_len    ref   rows    Extra   
1   SIMPLE       a      index   PRIMARY        PRIMARY      4          NULL 1051987 Using temporary; Using filesort
1   SIMPLE       agc    ref     id_agenda      id_agenda    4   db.a.id_agenda  1   Using where

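The query takes 10 seconds to finish for ID 22, and for other IDs it can reach 20 seconds. That is the query alone; loading everything into the web page of course takes even longer.

I don't understand why it takes so long to fetch the data. I think the indexes are configured correctly and the query is quite simple, so why?

Is there too much data?

I have solved it this way: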

SELECT a.id_agenda
  FROM
      (
              SELECT id_agenda
              FROM agenda  
              WHERE (id_customer = '22'  OR  id_owner = '22' )
          UNION
              SELECT id_agenda
              FROM  agenda_cars
              WHERE id_owner = '22'
      )  as at
INNER JOIN agenda as a on a.id_agenda = at.id_agenda
GROUP BY a.id_agenda
ORDER BY a.start_day,  a.start_hour

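This version of the query is ten times faster than the previous one... but why?

Thanks to everyone who wants to help clear up my doubts!

Update after Rick James's solution:

Suggested query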

SELECT  a.id_agenda
    FROM  
    (
        SELECT  id_agenda  FROM  agenda  WHERE  id_customer = '22'
        UNION DISTINCT
        SELECT  id_agenda  FROM  agenda  WHERE  id_owner = '22'
        UNION DISTINCT
        SELECT  id_agenda  FROM  agenda_cars  WHERE  id_owner = '22'
    ) as at
    INNER JOIN  agenda as a  ON a.id_agenda = at.id_agenda
    ORDER BY  a.start_datetime;

Result: 279 total, 0.0111 sec

EXPLAIN:

id      select_type     table           type    possible_keys   key         key_len     ref             rows        Extra
1       PRIMARY         <derived2>      ALL     NULL            NULL        NULL        NULL            366         Using temporary; Using filesort
1       PRIMARY         a               eq_ref  PRIMARY         PRIMARY     4           at.id_agenda    1           NULL
2       DERIVED         agenda          ref     id_customer     id_customer 5           const           1           Using index
3       UNION           agenda          ref     id_owner        id_owner    5           const           114         Using index
4       UNION           agenda_cars     ref     id_owner        id_owner    4           const           250         NULL
NULL    UNION RESULT    <union2,3,4>    ALL     NULL            NULL        NULL        NULL            NULL        Using temporary

Answer 1

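Before digging into what can be done, let me list some of the red flags I see.

OR is hard to optimize.
Filtering (WHERE) on multiple JOINed tables is hard to optimize.
GROUP BY x ORDER BY z means two passes over the data, usually with 2 temp tables and filesorts.
Did you really mean LEFT? It says "the right-hand table (agc) might be missing, in which case provide NULLs".

(You may not be able to get rid of all the red flags.)

Red flags in the schema:

Indexing every column - usually useless.
Only single-column indexes - "composite" indexes often help.
DATE and TIME as separate columns - usually leads to clumsy queries.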
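
To make the last point concrete, here is a minimal sketch of merging start_day and start_hour into a single column. The column name start_datetime and the literal dates are assumptions chosen for illustration; TIMESTAMP(date, time) is MySQL's built-in for combining the two values. Rows that still hold the '0000-00-00' default may need special handling depending on the server's sql_mode, so this is best tried on a copy of the table first.

```sql
-- Assumed new column; nullable so that existing rows can be back-filled.
ALTER TABLE agenda ADD COLUMN start_datetime DATETIME NULL;

-- Combine the existing DATE and TIME columns into the new one.
UPDATE agenda
   SET start_datetime = TIMESTAMP(start_day, start_hour);

-- A time-range filter then becomes one comparison on one column,
-- instead of separate conditions on the day and the hour.
SELECT id_agenda
  FROM agenda
 WHERE start_datetime >= '2024-01-01 09:00:00'
   AND start_datetime <  '2024-01-02 09:00:00';
```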

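OK, with those off my shoulders, now to study the query... (Oh, and thank you for providing the CREATEs and the EXPLAIN!)

The ON implies a 1:many relationship between agenda and agenda_cars. Is that correct?

id_owner and id_car are in both tables, yet are not included in the ON; what gives?

(Here is the answer to your final question.) Why the GROUP BY? I see no aggregates. I am guessing that the 1:many relationship leads to multiple rows and you need to de-duplicate. For de-duplication, use DISTINCT. But the real solution is to avoid the "inflate (JOIN) - deflate (GROUP BY)" syndrome. Your subquery is a good start toward that.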
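
A quick way to check that guess, sketched against the tables from the question: count how many agenda_cars rows each agenda entry has. If this returns any rows, the JOIN multiplies agenda rows, and that multiplication is what the GROUP BY was compensating for.

```sql
-- Agenda entries linked to more than one car (LIMIT keeps the check cheap to read).
SELECT id_agenda, COUNT(*) AS cars_per_agenda
  FROM agenda_cars
 GROUP BY id_agenda
HAVING COUNT(*) > 1
 LIMIT 10;
```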

Rolling some of the above comments in, plus some more:

SELECT  a.id_agenda
    FROM  
    (
        SELECT  id_agenda  FROM  agenda  WHERE  id_customer = '22'
        UNION DISTINCT
        SELECT  id_agenda  FROM  agenda  WHERE  id_owner = '22'
        UNION DISTINCT
        SELECT  id_agenda  FROM  agenda_cars  WHERE  id_owner = '22'
    ) as at
    INNER JOIN  agenda as a  ON a.id_agenda = at.id_agenda
    ORDER BY  a.start_datetime;

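Notes:

Got rid of the other OR.
The explicit UNION DISTINCT makes it clear that duplicates are expected and must be removed.
Tossed the GROUP BY, and did not use SELECT DISTINCT; UNION DISTINCT takes care of the need.
You have the 4 necessary indexes (one per subquery): (id_customer), (id_owner) (on both tables), and PRIMARY KEY(id_agenda).
The indexes are "covering" for all the subqueries - an extra bonus.
For the ORDER BY, there will be an unavoidable tmp table and filesort, but it won't be on a million rows.
(No composite indexes are needed this time.)
I changed to a DATETIME; change it back if you have a good reason to keep it split.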
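
One caveat on the "covering" note, going by the EXPLAIN in the question's update: the agenda_cars row there shows no "Using index". The likely reason is that the single-column id_owner index on that table carries the primary key id, not id_agenda. If that lookup ever matters, a composite index along the lines below would be expected to make the third subquery covering as well. The index name owner_agenda is hypothetical, and at roughly 250 rows per owner the gain may well be too small to justify another index.

```sql
-- Hypothetical composite index: filter column first, then the column being selected.
-- With this in place, the single-column id_owner index on agenda_cars becomes redundant.
ALTER TABLE agenda_cars
  ADD INDEX owner_agenda (id_owner, id_agenda);

-- Re-check the plan; "Using index" in the Extra column would confirm it is covering.
EXPLAIN
SELECT id_agenda FROM agenda_cars WHERE id_owner = 22;
```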

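Did I get you another 10x? Did I explain it well enough?

Oh, one more thing...

This query returns a list of ids ordered by something it does not return (date + time). What will you do with the ids? If you use it as a subquery against another table, the Optimizer is entitled to throw away the ORDER BY. Just a warning.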
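
For instance, if the ids are only a stepping stone to the appointment rows themselves, a sketch like the following keeps the ORDER BY on the outermost query, where it is honored, and returns the sort columns along with the id. It uses the start_day and start_hour columns from the question's schema; the alias ids and the selected column list are just examples.

```sql
SELECT a.id_agenda, a.title, a.start_day, a.start_hour
  FROM (
        SELECT id_agenda FROM agenda      WHERE id_customer = 22
        UNION DISTINCT
        SELECT id_agenda FROM agenda      WHERE id_owner = 22
        UNION DISTINCT
        SELECT id_agenda FROM agenda_cars WHERE id_owner = 22
       ) AS ids
 INNER JOIN agenda AS a ON a.id_agenda = ids.id_agenda
 ORDER BY a.start_day, a.start_hour;  -- ordering applied last, on the outer query
```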

MySQL query is not optimized and is very slow, but why?
