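Quick overview: I have put together a MySQL query, but it needs performance optimization.
My original post is here, but it has gone cold, and I really want to go into detail on some of the suggestions I have tried to implement. So this is not a duplicate post, although it is related.
This query takes 45 seconds or more, and the GROUP BY in the second subquery is what really slows it down.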
SELECT * FROM
(
SELECT DISTINCT email,
title,
first_name,
last_name,
'chauntry' AS source,
post_code AS postcode
FROM chauntry
WHERE mailing_indicator = 1
) AS x
JOIN
(
SELECT email,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Date_format(added, '%M %Y') )) AS unique_months
FROM chauntry
WHERE added >= Now() - INTERVAL 1 year
GROUP BY email
) AS y
ON x.email = y.email
Following an indexing suggestion here, I looked at some index examples and came up with the following:
ALTER TABLE `chauntry`
ADD INDEX(`mailing_indicator`, `email`);
ALTER TABLE `chauntry`
ADD INDEX covering_index (`added`, `email`, `amount_paid`);
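This made no difference at all to the query time. I don't know whether what I have done is even close; until now I have never needed to use indexes.
Suggestions are welcome on how to index my table correctly or how to modify the query.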
Answer 0 (score: 0)
Out of curiosity, does this query do what you want?
SELECT email, title, first_name, last_name, 'chauntry' AS source,
post_code AS postcode,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Date_format(added, '%M %Y') )) AS unique_months
FROM chauntry
WHERE added >= Now() - INTERVAL 1 year
GROUP BY email, title, first_name, last_name, post_code
HAVING SUM(mailing_indicator = 1) > 0;
It seems to follow the same logic as your query, except that the mailing indicator must have been set on a row from the past year.
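To make that difference concrete, here is a minimal sketch that runs simplified versions of both queries against an in-memory SQLite database through Python's `sqlite3` module. SQLite stands in for MySQL here, the sample rows and the fixed `CUTOFF` date (replacing `Now() - INTERVAL 1 year`) are assumptions for illustration, and only the `email` column is selected.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chauntry ("
    "id INTEGER PRIMARY KEY, email TEXT, mailing_indicator INTEGER, "
    "amount_paid REAL, added TEXT)"
)
conn.executemany(
    "INSERT INTO chauntry (email, mailing_indicator, amount_paid, added) "
    "VALUES (?, ?, ?, ?)",
    [
        ("a@example.com", 1, 10.0, "2020-01-15"),  # old booking, opted in
        ("a@example.com", 0, 20.0, "2023-06-01"),  # recent booking, not opted in
        ("b@example.com", 1, 30.0, "2023-07-01"),  # recent booking, opted in
    ],
)

# Hypothetical fixed cutoff used instead of Now() - INTERVAL 1 year.
CUTOFF = "2023-01-01"

# Shape of the original query: opt-in is checked on ANY row for the email.
original = """
SELECT x.email FROM
  (SELECT DISTINCT email FROM chauntry WHERE mailing_indicator = 1) AS x
  JOIN
  (SELECT email FROM chauntry WHERE added >= ? GROUP BY email) AS y
  ON x.email = y.email
ORDER BY x.email
"""

# Shape of the rewritten query: opt-in is checked only on rows after the cutoff.
rewritten = """
SELECT email FROM chauntry
WHERE added >= ?
GROUP BY email
HAVING SUM(mailing_indicator = 1) > 0
ORDER BY email
"""

original_emails = [r[0] for r in conn.execute(original, (CUTOFF,))]
rewritten_emails = [r[0] for r in conn.execute(rewritten, (CUTOFF,))]
print(original_emails)
print(rewritten_emails)
```

With this data, `a@example.com` is returned only by the original form, because its single opted-in row is older than the cutoff.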
Answer 1 (score: 0)
Why use a JOIN between subselects on the same table?
I would try this:
SELECT email,
title,
first_name,
last_name,
'chauntry' AS source,
post_code AS postcode,
Avg(amount_paid) AS avg_paid,
Count(*) AS no_times_booked,
Count(DISTINCT( Date_format(added, '%M %Y') )) AS unique_months
FROM chauntry
WHERE
mailing_indicator = 1 and
added >= Now() - INTERVAL 1 year
GROUP BY email
Also, I don't think you need any indexes for this kind of query, except perhaps on added and email, but you have already added those.
Answer 2 (score: 0)
I had a quick play with this.
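The average of amount_paid is the biggest problem. If you are prepared to live with the possibility of that figure being inaccurate, you can average the distinct values of the amount_paid field. In some cases this will give the wrong value (for example, with 100 bookings, 99 at $1 and 1 at $100, the average comes out as $50.50 instead of $1.99), but if the amounts paid never repeat it would be acceptable.
Otherwise, you can join the table against itself. To get no_times_booked, you can count the DISTINCT unique identifier of the table (I have assumed id here).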
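The arithmetic in that example can be checked with a small sketch. It uses Python's `sqlite3` module with an in-memory table as a stand-in for MySQL; the `bookings` table name is hypothetical and holds only the 100 amounts from the example.

```python
import sqlite3

# In-memory SQLite table holding the 100 bookings from the example:
# 99 bookings at $1 and one booking at $100 (table name is hypothetical).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (amount_paid REAL)")
conn.executemany(
    "INSERT INTO bookings VALUES (?)",
    [(1.0,)] * 99 + [(100.0,)],
)

# AVG uses every row; AVG(DISTINCT ...) collapses the 99 repeated $1 values
# into one before averaging.
plain_avg, distinct_avg = conn.execute(
    "SELECT AVG(amount_paid), AVG(DISTINCT amount_paid) FROM bookings"
).fetchone()

print(round(plain_avg, 2))     # (99 * 1 + 100) / 100
print(round(distinct_avg, 2))  # (1 + 100) / 2
```

The plain average is 199 / 100 = 1.99, while the distinct average is 101 / 2 = 50.50, matching the figures quoted above.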
SELECT c1.email,
c1.title,
c1.first_name,
c1.last_name,
'chauntry' AS source,
c1.post_code AS postcode,
Avg(DISTINCT c2.amount_paid) AS avg_paid,
Count(DISTINCT c2.id) AS no_times_booked,
Count(DISTINCT( Date_format(c2.added, '%M %Y') )) AS unique_months
FROM chauntry c1
INNER JOIN chauntry c2
ON c1.email = c2.email
WHERE c1.mailing_indicator = 1
AND c2.added >= Now() - INTERVAL 1 year
GROUP BY c1.email,
c1.title,
c1.first_name,
c1.last_name,
source,
c1.post_code
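The reason the counts need DISTINCT in a self-join like this is that every c1 row for an email is paired with every c2 row for that email. The sketch below illustrates the effect using Python's `sqlite3` module as a stand-in for MySQL. The three sample rows are assumptions, and the date filter and the extra name columns are left out to keep it short.

```python
import sqlite3

# In-memory SQLite stand-in for the MySQL table (assumption), with three
# bookings for one email: amounts 10, 10 and 40.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chauntry ("
    "id INTEGER PRIMARY KEY, email TEXT, mailing_indicator INTEGER, "
    "amount_paid REAL)"
)
conn.executemany(
    "INSERT INTO chauntry (email, mailing_indicator, amount_paid) "
    "VALUES (?, ?, ?)",
    [
        ("a@example.com", 1, 10.0),
        ("a@example.com", 1, 10.0),
        ("a@example.com", 1, 40.0),
    ],
)

# Self-join on email: 3 rows on the c1 side x 3 rows on the c2 side = 9 rows.
joined_rows, bookings, plain_avg, distinct_avg = conn.execute(
    """
    SELECT COUNT(*),
           COUNT(DISTINCT c2.id),
           AVG(c2.amount_paid),
           AVG(DISTINCT c2.amount_paid)
    FROM chauntry c1
    INNER JOIN chauntry c2 ON c1.email = c2.email
    WHERE c1.mailing_indicator = 1
    GROUP BY c1.email
    """
).fetchone()

print(joined_rows, bookings)    # inflated row count vs. real booking count
print(plain_avg, distinct_avg)  # true mean vs. mean of distinct amounts
```

In this data COUNT(*) reports 9 while COUNT(DISTINCT c2.id) reports the real 3 bookings. Because each c2 row is repeated the same number of times within a group, the plain AVG still returns the true mean of 20, whereas AVG(DISTINCT) returns 25. So in a setup like this one it may be worth testing whether plain AVG already gives the figure you want.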