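In the software I develop, a car sales application, one section is an agenda that holds all of a user's appointments.
This section loads quickly under everyday use, with a few thousand rows, but it becomes very slow once the agenda table reaches 1 million rows.
Structure:
1) Main table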
CREATE TABLE IF NOT EXISTS `agenda` (
`id_agenda` int(11) NOT NULL AUTO_INCREMENT,
`id_user` int(11) NOT NULL DEFAULT '0',
`id_agency` int(11) NOT NULL DEFAULT '0',
`id_customer` int(11) DEFAULT NULL,
`id_car` int(11) DEFAULT NULL,
`id_owner` int(11) DEFAULT NULL,
`type` int(11) NOT NULL DEFAULT '8',
`title` varchar(255) NOT NULL DEFAULT '',
`text` text NOT NULL,
`start_day` date NOT NULL DEFAULT '0000-00-00',
`end_day` date NOT NULL DEFAULT '0000-00-00',
`start_hour` time NOT NULL DEFAULT '00:00:00',
`end_hour` time NOT NULL DEFAULT '00:00:00',
PRIMARY KEY (`id_agenda`),
KEY `start_day` (`start_day`),
KEY `id_customer` (`id_customer`),
KEY `id_car` (`id_car`),
KEY `id_user` (`id_user`),
KEY `id_owner` (`id_owner`),
KEY `type` (`type`),
KEY `id_agency` (`id_agency`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ;
2) Secondary table
CREATE TABLE IF NOT EXISTS `agenda_cars` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_agenda` int(11) NOT NULL,
`id_car` int(11) NOT NULL,
`id_owner` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `id_agenda` (`id_agenda`),
KEY `id_car` (`id_car`),
KEY `id_owner` (`id_owner`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
Query:
SELECT a.id_agenda
FROM agenda as a
LEFT JOIN agenda_cars as agc on agc.id_agenda = a.id_agenda
WHERE
(a.id_customer = '22' OR (a.id_owner = '22' OR agc.id_owner = '22' ))
GROUP BY a.id_agenda
ORDER BY a.start_day, a.start_hour
EXPLAIN:
id  select_type  table  type   possible_keys  key        key_len  ref             rows     Extra
1   SIMPLE       a      index  PRIMARY        PRIMARY    4        NULL            1051987  Using temporary; Using filesort
1   SIMPLE       agc    ref    id_agenda      id_agenda  4        db.a.id_agenda  1        Using where
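The query takes 10 seconds to finish for ID 22, and for other IDs it can reach 20 seconds. That is the query alone; loading everything into the web page of course takes longer.
I don't understand why it takes so long to fetch the data. I think the indexes are set up correctly and the query is quite simple, so why?
Too much data?
I have solved it this way: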
SELECT a.id_agenda
FROM
(
SELECT id_agenda
FROM agenda
WHERE (id_customer = '22' OR id_owner = '22' )
UNION
SELECT id_agenda
FROM agenda_cars
WHERE id_owner = '22'
) as at
INNER JOIN agenda as a on a.id_agenda = at.id_agenda
GROUP BY a.id_agenda
ORDER BY a.start_day, a.start_hour
This version of the query is ten times faster than the previous one... but why?
Update after Rick James's solution:
Suggested query
SELECT a.id_agenda
FROM
(
SELECT id_agenda FROM agenda WHERE id_customer = '22'
UNION DISTINCT
SELECT id_agenda FROM agenda WHERE id_owner = '22'
UNION DISTINCT
SELECT id_agenda FROM agenda_cars WHERE id_owner = '22'
) as at
INNER JOIN agenda as a ON a.id_agenda = at.id_agenda
ORDER BY a.start_datetime;
Result: 279 rows total, 0.0111 seconds
EXPLAIN:
id    select_type   table         type    possible_keys  key          key_len  ref           rows  Extra
1     PRIMARY       <derived2>    ALL     NULL           NULL         NULL     NULL          366   Using temporary; Using filesort
1     PRIMARY       a             eq_ref  PRIMARY        PRIMARY      4        at.id_agenda  1     NULL
2     DERIVED       agenda        ref     id_customer    id_customer  5        const         1     Using index
3     UNION         agenda        ref     id_owner       id_owner     5        const         114   Using index
4     UNION         agenda_cars   ref     id_owner       id_owner     4        const         250   NULL
NULL  UNION RESULT  <union2,3,4>  ALL     NULL           NULL         NULL     NULL          NULL  Using temporary
Answer 0: (score: 3)
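Before digging into what can be done, let me list a few red flags I see.
- OR is hard to optimize.
- Filtering (WHERE) on JOINed tables is hard to optimize.
- GROUP BY x ORDER BY z means two passes over the data, usually with 2 temp tables and filesorts.
- Is LEFT really needed? It says "the right table (agc) may be missing, in which case supply NULLs".
(You may not be able to get rid of all of the red flags.)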
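Red flags in the schema:
- DATE and TIME as separate columns: this usually leads to clumsy queries.
OK, those are off my chest; now to study the query... (Oh, and thanks for providing the CREATEs and the EXPLAIN!)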
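As a rough illustration of the DATE/TIME point, a combined column could be added and backfilled along these lines. This is only a sketch: the column name start_datetime is an assumption taken from the query further down, not something in the posted schema, and rows that still hold the '0000-00-00' default may be rejected or turned into NULL depending on sql_mode, so it is worth trying on a copy of the table first.

```sql
-- Assumption: start_datetime does not exist yet; the name is illustrative.
ALTER TABLE agenda
    ADD COLUMN start_datetime DATETIME NULL;

-- TIMESTAMP(date, time) adds the time to the date and returns a DATETIME.
-- On a million-row table this single UPDATE is heavy; batching it may be preferable.
UPDATE agenda
SET start_datetime = TIMESTAMP(start_day, start_hour);
```

With that in place, ORDER BY start_datetime replaces ORDER BY start_day, start_hour.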
The ON implies a 1:many relationship between agenda and agenda_cars. Is that correct?
id_owner and id_car are in both tables, yet they are not included in the ON; what gives?
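Both questions can be checked against the data. The two diagnostic queries below are a sketch using only the columns from the posted schema; the second one joins the full tables, so it is meant as a one-off check, not something to run on every page load.

```sql
-- Agenda entries with more than one agenda_cars row (evidence of 1:many).
SELECT id_agenda, COUNT(*) AS car_rows
FROM agenda_cars
GROUP BY id_agenda
HAVING COUNT(*) > 1
LIMIT 10;

-- How often does agenda_cars.id_owner differ from agenda.id_owner?
-- <=> is MySQL's NULL-safe equality, needed because agenda.id_owner is nullable.
SELECT COUNT(*) AS mismatched_owner_rows
FROM agenda AS a
INNER JOIN agenda_cars AS agc ON agc.id_agenda = a.id_agenda
WHERE NOT (agc.id_owner <=> a.id_owner);
```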
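(Here is the answer to your final question.) Why have the GROUP BY? I see no aggregates. I'd guess the 1:many relationship produced multiple rows and you needed to de-duplicate them? For de-duplication, use DISTINCT. But the real solution is to avoid the "inflate (JOIN) - deflate (GROUP BY)" syndrome. Your subquery is a good start.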
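For reference, this is roughly what the original query looks like with DISTINCT doing the de-duplication in place of GROUP BY. It is a sketch of the de-dup point only: it still inflates through the JOIN and still has the OR across two tables, so it should not be expected to run any faster. The sort columns are added to the select list because some sql_mode settings reject a DISTINCT query that orders by columns it does not select; since id_agenda is the primary key, the extra columns do not change which rows are distinct.

```sql
-- De-duplicate with DISTINCT instead of GROUP BY (same rows, same order).
SELECT DISTINCT a.id_agenda, a.start_day, a.start_hour
FROM agenda AS a
LEFT JOIN agenda_cars AS agc ON agc.id_agenda = a.id_agenda
WHERE a.id_customer = 22
   OR a.id_owner = 22
   OR agc.id_owner = 22
ORDER BY a.start_day, a.start_hour;
```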
Rolling together some of the comments above, plus a few more:
SELECT a.id_agenda
FROM
(
SELECT id_agenda FROM agenda WHERE id_customer = '22'
UNION DISTINCT
SELECT id_agenda FROM agenda WHERE id_owner = '22'
UNION DISTINCT
SELECT id_agenda FROM agenda_cars WHERE id_owner = '22'
) as at
INNER JOIN agenda as a ON a.id_agenda = at.id_agenda
ORDER BY a.start_datetime;
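Notes:
- Got rid of the remaining OR.
- The explicit UNION DISTINCT makes it clear that duplicates are expected and must be removed.
- Toss the GROUP BY and don't use SELECT DISTINCT; UNION DISTINCT takes care of the need.
- The indexes this relies on already exist: (id_customer), (id_owner) (on both tables), and PRIMARY KEY(id_agenda).
- Because of the ORDER BY, there will be an unavoidable tmp table and filesort, but it won't be over a million rows.
- I changed to a DATETIME; change it back if you have a good reason for splitting it.
Did I get you another 10x? Did I explain it well enough?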
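Oh, one more thing...
This query returns a list of ids ordered by something it does not return (date + time). What will you do with the ids? If you use this as a subquery against another table, the optimizer is entitled to discard the ORDER BY. Just a warning.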
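One way to sidestep that, sketched here under the assumption that the page needs a few display columns anyway, is to fetch those columns in the same statement and keep the ORDER BY at the outermost level. This version uses the split start_day and start_hour columns from the posted schema; the choice of title as an extra column and the alias ids are only illustrations.

```sql
-- Return the sort columns with the id and sort in the outermost query,
-- so no enclosing query is left to discard the ordering.
SELECT a.id_agenda, a.title, a.start_day, a.start_hour
FROM
(
    SELECT id_agenda FROM agenda WHERE id_customer = 22
    UNION DISTINCT
    SELECT id_agenda FROM agenda WHERE id_owner = 22
    UNION DISTINCT
    SELECT id_agenda FROM agenda_cars WHERE id_owner = 22
) AS ids
INNER JOIN agenda AS a ON a.id_agenda = ids.id_agenda
ORDER BY a.start_day, a.start_hour;
```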