I want to count the ratings in each rating group for a given date range. I wrote the query below, and it returns the correct result:
SELECT c.day,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 1 AND r.campaign_id = 2) AS rating1s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 2 AND r.campaign_id = 2) AS rating2s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 3 AND r.campaign_id = 2) AS rating3s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 4 AND r.campaign_id = 2) AS rating4s,
(SELECT COUNT(DISTINCT user_id) FROM ratings r WHERE DATE(r.created_at) = c.day AND r.rating = 5 AND r.campaign_id = 2) AS rating5s
FROM calendar c
WHERE c.day >= '2018-08-01'
GROUP BY c.day
ORDER BY c.day
LIMIT 0, 31
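However, this is not an efficient approach: it uses five subqueries, and the query takes almost 2 minutes on my localhost. How can I optimize it? Sample output is attached; I need the same output.
Answer 0 (score: 1)
You can rewrite this as conditional aggregation: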
SELECT DATE(r.created_at),
COUNT(DISTINCT CASE WHEN r.rating = 1 THEN r.user_id END) as rating_1,
COUNT(DISTINCT CASE WHEN r.rating = 2 THEN r.user_id END) as rating_2,
COUNT(DISTINCT CASE WHEN r.rating = 3 THEN r.user_id END) as rating_3,
COUNT(DISTINCT CASE WHEN r.rating = 4 THEN r.user_id END) as rating_4,
COUNT(DISTINCT CASE WHEN r.rating = 5 THEN r.user_id END) as rating_5
FROM ratings r
WHERE r.campaign_id = 2 AND
r.created_at >= '2018-08-01'
GROUP BY DATE(r.created_at);
COUNT(DISTINCT) can be expensive. If you can, remove it. Otherwise, doing the DISTINCT once may be faster:
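-- In MySQL a comparison evaluates to 1 or 0, so SUM() counts the matching rows.
-- The outer query reads from the derived table urd, so the columns are qualified with urd, not r.
SELECT urd.dte,
SUM( urd.rating = 1 ) as rating_1,
SUM( urd.rating = 2 ) as rating_2,
SUM( urd.rating = 3 ) as rating_3,
SUM( urd.rating = 4 ) as rating_4,
SUM( urd.rating = 5 ) as rating_5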
FROM (SELECT DISTINCT user_id, rating, DATE(r.created_at) as dte
FROM ratings r
WHERE r.campaign_id = 2 AND
r.created_at >= '2018-08-01'
) urd
GROUP BY dte;
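This returns a row only for each day that has at least one rating. If some days have no ratings at all, you will need some sort of outer join. That adds very little overhead, so if one of the solutions above works for you, it can be extended that way.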
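As a rough sketch of that outer join, the aggregated result can be left-joined to the calendar table from the question so that days without ratings still appear, with zeros. This assumes calendar has one row per day in a DATE column named day, as the original query suggests; the derived-table alias x is only illustrative, and the column aliases follow the question's output.

```sql
SELECT c.day,
       COALESCE(x.rating1s, 0) AS rating1s,
       COALESCE(x.rating2s, 0) AS rating2s,
       COALESCE(x.rating3s, 0) AS rating3s,
       COALESCE(x.rating4s, 0) AS rating4s,
       COALESCE(x.rating5s, 0) AS rating5s
FROM calendar c
LEFT JOIN (
    -- Aggregate the ratings once per day (the conditional aggregation from above).
    SELECT DATE(r.created_at) AS dte,
           COUNT(DISTINCT CASE WHEN r.rating = 1 THEN r.user_id END) AS rating1s,
           COUNT(DISTINCT CASE WHEN r.rating = 2 THEN r.user_id END) AS rating2s,
           COUNT(DISTINCT CASE WHEN r.rating = 3 THEN r.user_id END) AS rating3s,
           COUNT(DISTINCT CASE WHEN r.rating = 4 THEN r.user_id END) AS rating4s,
           COUNT(DISTINCT CASE WHEN r.rating = 5 THEN r.user_id END) AS rating5s
    FROM ratings r
    WHERE r.campaign_id = 2
      AND r.created_at >= '2018-08-01'
    GROUP BY DATE(r.created_at)
) x ON x.dte = c.day          -- days with no ratings get NULLs, turned into 0 by COALESCE
WHERE c.day >= '2018-08-01'
ORDER BY c.day
LIMIT 0, 31;
```

Because the ratings are aggregated before the join, the calendar table only adds one lookup per day rather than one subquery per rating per day.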
Answer 1 (score: 0)
Here is the query I ended up with, based on @Gordon's answer:
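-- No ELSE branch: rows with a different rating yield NULL, which COUNT(DISTINCT ...) ignores.
-- With ELSE 0, the value 0 would be counted as an extra "user" in every group.
SELECT DATE(r.created_at),
COUNT(
DISTINCT
CASE
WHEN r.rating = 1
THEN r.user_id
END
) as rating1s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 2
THEN r.user_id
END
) as rating2s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 3
THEN r.user_id
END
) as rating3s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 4
THEN r.user_id
END
) as rating4s,
COUNT(
DISTINCT
CASE
WHEN r.rating = 5
THEN r.user_id
END
) as rating5s
FROM ratings r
-- Filtering on the bare column (not DATE(r.created_at)) selects the same rows and lets an index on created_at be used.
WHERE r.campaign_id = 2 AND
r.created_at >= '2018-08-01'
GROUP BY DATE(r.created_at)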
This is still not fully optimized, but it is much better than my original solution.
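One possible next step, offered only as a sketch: a composite index that covers the columns these queries touch. The index name idx_ratings_campaign_created is hypothetical, and whether it helps depends on the data and the MySQL version, so it is worth comparing EXPLAIN output before and after. A range filter written directly on r.created_at can use such an index, whereas a filter on DATE(r.created_at) generally cannot.

```sql
-- Hypothetical index name; the columns are the ones used in the queries above.
-- campaign_id (equality) comes first, then created_at (range);
-- rating and user_id are included so the query can be answered from the index alone.
CREATE INDEX idx_ratings_campaign_created
    ON ratings (campaign_id, created_at, rating, user_id);

-- Inspect the plan to see whether the index is actually chosen.
EXPLAIN
SELECT DATE(r.created_at),
       COUNT(DISTINCT CASE WHEN r.rating = 1 THEN r.user_id END) AS rating1s
FROM ratings r
WHERE r.campaign_id = 2
  AND r.created_at >= '2018-08-01'
GROUP BY DATE(r.created_at);
```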