I have a slow MySQL query (MySQL 5+). Consider three tables:
customers:
- id_customer : int (PRIMARY)
- name : varchar(255)
customers_addresses:
- id_customers_addresses : int (PRIMARY)
- id_customer : int (INDEX)
- street : varchar(255)
- zipcode : varchar(255)
- city : varchar(255)
customers_contacts:
- id_customers_contacts : int (PRIMARY)
- id_customer : int (INDEX)
- type : varchar(255)
- value : varchar(255)
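My goal is to collect all address and contact information in a single query, with one row per customer. My first attempt used LEFT JOINs, because some customers have no address and/or contact information: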
SELECT customers.id_customer,
customers.name,
X.contact AS contact,
Y.street,
Y.zipcode,
Y.city
FROM customers
LEFT JOIN
(
SELECT
GROUP_CONCAT( CONCAT( type, ': ', value ) SEPARATOR ', ' ) AS contact,
id_customer
FROM customers_contacts
GROUP BY id_customer
) AS X
ON X.id_customer = customers.id_customer
LEFT JOIN
(
SELECT
GROUP_CONCAT(street SEPARATOR '<br>' ) AS street,
GROUP_CONCAT(zipcode SEPARATOR '<br>' ) AS zipcode,
GROUP_CONCAT(city SEPARATOR '<br>' ) AS city,
id_customer
FROM customers_addresses
GROUP BY id_customer
) AS Y
ON Y.id_customer = customers.id_customer
WHERE Y.street LIKE '%Avenue%'
ORDER BY customers.name DESC
LIMIT 0, 20
This query takes 130 seconds to complete (about 7,000 rows in each table), which is far too slow.
Prefixing the query with EXPLAIN EXTENDED gives:
id  select_type  table                type   possible_keys  key          key_len  ref    rows  filtered  Extra
1   PRIMARY      customers            ref    name           name         3        const  4334  100.00    Using where; Using temporary; Using filesort
1   PRIMARY      <derived2>           ALL    NULL           NULL         NULL     NULL   7793  100.00
1   PRIMARY      <derived3>           ALL    NULL           NULL         NULL     NULL   8580  100.00    Using where
3   DERIVED      customers_addresses  index  NULL           id_customer  5        NULL   8651  100.00
2   DERIVED      customers_contacts   index  NULL           id_customer  4        NULL   9314  100.00
I read several Stack Overflow posts and the MySQL documentation. Both say that INNER JOIN is much faster, so I tried to replicate the LEFT JOIN behaviour using INNER JOIN with UNION ALL:
SELECT customers.id_customer,
customers.name,
X.contact AS contact,
Y.street,
Y.zipcode,
Y.city
FROM customers
INNER JOIN
(
SELECT
GROUP_CONCAT( CONCAT( type, ': ', value ) SEPARATOR ', ' ) AS contact,
id_customer
FROM customers_contacts
GROUP BY id_customer
UNION ALL
SELECT
'' AS contact,
id_customer
FROM customers
WHERE id_customer NOT IN (SELECT DISTINCT id_customer FROM customers_contacts)
) AS X
ON X.id_customer = customers.id_customer
INNER JOIN
(
SELECT
GROUP_CONCAT(street SEPARATOR '<br>' ) AS street,
GROUP_CONCAT(zipcode SEPARATOR '<br>' ) AS zipcode,
GROUP_CONCAT(city SEPARATOR '<br>' ) AS city,
id_customer
FROM customers_addresses
GROUP BY id_customer
UNION ALL
SELECT
'' AS street,
'' AS zipcode,
'' AS city,
id_customer
FROM customers
WHERE id_customer NOT IN (SELECT DISTINCT id_customer FROM customers_addresses)
) AS Y
ON Y.id_customer = customers.id_customer
WHERE Y.street LIKE '%Avenue%'
ORDER BY customers.name DESC
LIMIT 0, 20
This query is 20 seconds faster, but 110 seconds is still unacceptable.
Prefixing the query with EXPLAIN EXTENDED gives:
id    select_type         table                type            possible_keys       key       key_len  ref         rows   filtered  Extra
1     PRIMARY             <derived2>           ALL             NULL                NULL      NULL     NULL        8596   100.00    Using temporary; Using filesort
1     PRIMARY             <derived5>           ALL             NULL                NULL      NULL     NULL        8604   100.00    Using join buffer
1     PRIMARY             customers            eq_ref          PRIMARY,name,name3  PRIMARY   4        Y.id_kunde  1      100.00    Using where
5     DERIVED             customers_addresses  index           NULL                id_kunde  5        NULL        8651   100.00
6     UNION               customers            index           NULL                name2     767      NULL        8677   100.00    Using where; Using index
7     DEPENDENT SUBQUERY  customers_addresses  index_subquery  id_kunde            id_kunde  5        func        2      100.00    Using index
NULL  UNION RESULT        <union5,6>           ALL             NULL                NULL      NULL     NULL        NULL   NULL
2     DERIVED             customers_contacts   index           NULL                id_kunde  4        NULL        10411  100.00
3     UNION               customers            index           NULL                name2     767      NULL        8677   100.00    Using where; Using index
4     DEPENDENT SUBQUERY  customers_contacts   index_subquery  id_kunde            id_kunde  4        func        1      100.00    Using index
NULL  UNION RESULT        <union2,3>           ALL             NULL                NULL      NULL     NULL        NULL   NULL
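So here is my question: how can I improve one of these queries and/or the database tables to get a much faster response? I am interested not only in a solution, but also in strategies for avoiding this kind of performance killer in the future.
Best regards.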
Answer 0 (score: 3)
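As a general rule, which applies here, you can say the following:
Whenever you join against the result of a query (a subquery), MySQL must first run that subquery and then create a table from its result. You do this twice, which means MySQL first creates two tables, only to drop them once the result is complete. With proper MySQL memory management this can be done in memory. However, these tables are created without indexes, because MySQL cannot magically determine which index would suit a derived table best. Because they are usually created in memory, queries against them are still reasonably fast, though not as fast as a SELECT that uses a key.
Then, once both tables are complete, MySQL has to join the original table to the two of them and build a third table on the fly, which then needs to be filtered and sorted according to your criteria.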
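To make the point about unindexed derived tables concrete, here is a minimal sketch of one way to give the aggregated data an index: materialize it in an explicit temporary table first. This is an illustration only, not something the question tried; the table name tmp_contacts is made up, and whether it actually helps should be checked with EXPLAIN on the real data.

```sql
-- Sketch only. tmp_contacts is a hypothetical name chosen for this example.
-- Materialize the aggregated contacts once and index the join column,
-- instead of letting MySQL build an unindexed derived table.
CREATE TEMPORARY TABLE tmp_contacts (KEY (id_customer))
SELECT id_customer,
       GROUP_CONCAT(CONCAT(type, ': ', value) SEPARATOR ', ') AS contact
FROM customers_contacts
GROUP BY id_customer;

-- The optimizer now has an index on tmp_contacts.id_customer available
-- for the join.
SELECT c.id_customer,
       c.name,
       t.contact
FROM customers AS c
LEFT JOIN tmp_contacts AS t
       ON t.id_customer = c.id_customer
ORDER BY c.name DESC
LIMIT 0, 20;

-- Clean up explicitly; the table would also disappear when the session ends.
DROP TEMPORARY TABLE tmp_contacts;
```

The aggregation cost of GROUP_CONCAT is still paid once; the sketch only addresses the missing index on the intermediate result.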
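This is a performance killer. One of your requirements is that each customer produces only one row. That is not how the database stores the information, so you pay for the data transformation at run time (your GROUP_CONCAT calls). I am not 100% sure what the current MySQL database engines do with UNION statements, so I would rather not comment on those.
Use a simple INNER JOIN on the available keys instead, and accept multiple rows for a customer that has more than one address; you will see performance jump right away. If you are not willing to regroup the combined result of all customers and all their addresses into customers at that layer, you can just as easily iterate over the customers in your programming-language layer and request the addresses for one customer at a time.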
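As a rough sketch of the two options described above, using the table and column names from the question: the first statement joins directly on the indexed id_customer column and returns one row per customer/address pair, and the second is the kind of per-customer lookup an application loop would issue. The id value 42 is a placeholder. Note that the first query returns only the matching addresses, whereas the original returned all addresses of a matching customer, and a LIMIT here would count rows rather than customers, so it is left out.

```sql
-- Option 1 (sketch): plain join on the existing id_customer index.
-- One row per (customer, address) pair; no derived table is built.
SELECT c.id_customer,
       c.name,
       a.street,
       a.zipcode,
       a.city
FROM customers AS c
INNER JOIN customers_addresses AS a
        ON a.id_customer = c.id_customer
WHERE a.street LIKE '%Avenue%'
ORDER BY c.name DESC, c.id_customer;

-- Option 2 (sketch): fetch the customers first, then run one lookup
-- per customer from the application layer. 42 is a placeholder id.
SELECT street, zipcode, city
FROM customers_addresses
WHERE id_customer = 42;

SELECT type, value
FROM customers_contacts
WHERE id_customer = 42;
```

Either way, grouping the rows back into one entry per customer then happens in application code rather than in GROUP_CONCAT.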
TL;DR: Drop your requirement or bear the overhead.