I have two tables. The first is calendar and the second is final_registration, as shown below:
*--------------------------*
| calender_id | datefield |
*--------------------------*
| 1 | 2015-07-13 |
| 2 | 2015-07-14 |
| 3 | 2015-07-15 |
| 4 | 2015-07-16 |
| - | ---------- |
| - | ---------- |
| - | ---------- |
| 5647 | 2030-12-28 |
| 5648 | 2030-12-29 |
| 5649 | 2030-12-30 |
| 5650 | 2030-12-31 |
*--------------------------*
So my first table has about 5650 records.
The second table is my registration table, in which I store user information together with the booking date:
*--------------------------------------------------*
| id | name | booking_date | ticket_status |
*--------------------------------------------------*
| 1 | RAM | 2018-12-24 12:54:53 | active |
| 2 | RAO | 2018-12-24 12:54:53 | active |
| 3 | RAT | 2018-12-24 12:54:53 | active |
| 4 | PAL | 2018-11-24 12:54:53 | active |
| 5 | TOM | 2018-10-24 12:54:53 | active |
| 6 | SAM | 2018-10-24 12:54:53 | active |
| 7 | RAT | 2018-09-24 12:54:53 | active |
| 8 | MAT | 2019-12-24 12:54:53 | active |
| 9 | NOT | 2019-12-24 12:54:53 | active |
| 10 | RAM | 2019-12-24 12:54:53 | active |
*--------------------------------------------------*
Now I want to count the registrations booked in 2018, broken down by month:
| booking_date | countT |
| 2018-01 | 0 |
| 2018-02 | 0 |
| 2018-03 | 0 |
| 2018-04 | 0 |
| 2018-05 | 0 |
| 2018-06 | 0 |
| 2018-07 | 0 |
| 2018-08 | 0 |
| 2018-09 | 1 |
| 2018-10 | 2 |
| 2018-11 | 1 |
| 2018-12 | 3 |
I am using the following query. It gives me the correct output, but the problem is the execution time: it takes at least 10 minutes to run.
SELECT
DATE_FORMAT(calendar.datefield, '%Y-%m') AS booking_date,
COUNT(final_registration.booking_date) AS countT
FROM calendar
LEFT JOIN final_registration ON DATE_FORMAT(final_registration.booking_date, '%Y-%m-%d') =
DATE_FORMAT(calendar.datefield, '%Y-%m-%d')
AND final_registration.ticket_status IN ('active', 'cancelled')
WHERE DATE_FORMAT(calendar.datefield, '%Y') = $year
GROUP BY DATE_FORMAT(calendar.datefield, '%Y-%m')
Answer 0: (score: 2)
I would suggest a correlated subquery and an index:
SELECT yyyymm,
       -- Count the bookings that fall inside each month's half-open range.
       (SELECT COUNT(*)
        FROM final_registration fr
        WHERE fr.ticket_status IN ('active', 'cancelled') AND
              fr.booking_date >= c.month_start AND
              fr.booking_date < c.month_start + INTERVAL 1 MONTH
       ) AS countT
FROM (SELECT DATE_FORMAT(c.datefield, '%Y-%m') AS yyyymm,
             MIN(c.datefield) AS month_start
      FROM calendar c
      WHERE YEAR(c.datefield) = ? -- PASS IN AS PARAMETER!!!
      GROUP BY yyyymm
     ) c
ORDER BY c.yyyymm;
The index you want is on final_registration(booking_date, ticket_status).
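For illustration, the index could be created along these lines. This is only a sketch: the index name idx_fr_booking_status is made up, and the column names are taken from the table in the question. The EXPLAIN afterwards is one way to check whether the optimizer actually uses the index for a single month's range.

```sql
-- Hypothetical index name; columns follow the question's final_registration table.
CREATE INDEX idx_fr_booking_status
    ON final_registration (booking_date, ticket_status);

-- Inspect the plan for one month's half-open range (December 2018 as an example).
EXPLAIN
SELECT COUNT(*)
FROM final_registration fr
WHERE fr.ticket_status IN ('active', 'cancelled')
  AND fr.booking_date >= '2018-12-01'
  AND fr.booking_date <  '2019-01-01';
```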
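This has several benefits over your query. In particular, booking_date is compared directly rather than through a function, so the index can be used, and the outer GROUP BY is avoided. Also note the use of a parameter, rather than munging the query string with literal values.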
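Answer 1: (score: 0)
I think the problem lies with the indexes.
Your query would only perform well if there were a function-based index on DATE_FORMAT(final_registration.booking_date, '%Y-%m-%d'). I am not sure which version of MySQL you are using, or whether it offers such an option...
In any case, I bet you have a plain index on final_registration.booking_date. If so, your join clause is the culprit, because the index will not be used. You should not convert the dates to strings, so that the index can do its work: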
LEFT JOIN final_registration ON final_registration.booking_date = calendar.datefield
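By the way, the WHERE clause has the same problem. Always prefer converting the parameter over converting the table column, for example: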
WHERE calendar.datefield BETWEEN str_to_date(concat("01-01-", year(now())), "%d-%m-%Y") AND str_to_date(concat("31-12-", year(now())), "%d-%m-%Y")
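One caveat: in the question's sample data booking_date carries a time of day, so a plain equality against a date would only match bookings made exactly at midnight. A possible way to keep both columns free of functions is a half-open range per calendar day. The sketch below assumes calendar.datefield is a DATE and final_registration.booking_date is a DATETIME, hard-codes 2018 as the year, and uses booking_month as an illustrative output alias.

```sql
-- Assumption: datefield is DATE, booking_date is DATETIME.
SELECT DATE_FORMAT(c.datefield, '%Y-%m') AS booking_month,
       COUNT(fr.booking_date)            AS countT
FROM calendar c
LEFT JOIN final_registration fr
       ON fr.booking_date >= c.datefield                   -- start of the day
      AND fr.booking_date <  c.datefield + INTERVAL 1 DAY  -- start of the next day
      AND fr.ticket_status IN ('active', 'cancelled')
WHERE c.datefield >= MAKEDATE(2018, 1)       -- 2018-01-01
  AND c.datefield <  MAKEDATE(2018 + 1, 1)   -- 2019-01-01
GROUP BY DATE_FORMAT(c.datefield, '%Y-%m')
ORDER BY booking_month;
```

MAKEDATE(year, 1) returns the first day of the given year, so the year is the only value that has to be supplied.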
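Answer 2: (score: 0)
I would suggest performing the aggregation before the join, and actually computing the start and end of the range you want and using BETWEEN. Using functions such as DATE_FORMAT() or even YEAR() in your WHERE conditions kills performance (all the more so if the date fields they are called on have no index)... Also, make sure you have an index on booking_date.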
SELECT c.booking_year, c.booking_month
     , COALESCE(bookingSummary.countT, 0) AS countT -- months without bookings show 0
FROM (
    -- One row per month of the requested year.
    SELECT DISTINCT YEAR(datefield) AS booking_year, MONTH(datefield) AS booking_month
    FROM calendar
    WHERE datefield BETWEEN [firstdayofyear] AND [lastdayofyear]
) AS c
LEFT JOIN (
    -- Aggregate the bookings before joining.
    -- If booking_date has a time part, [lastdayofyear] must cover the whole last day.
    SELECT YEAR(booking_date) AS booking_year, MONTH(booking_date) AS booking_month
         , COUNT(*) AS countT
    FROM final_registration AS fr
    WHERE fr.ticket_status IN ('active', 'cancelled')
      AND fr.booking_date BETWEEN [firstdayofyear] AND [lastdayofyear]
    GROUP BY booking_year, booking_month
) AS bookingSummary
USING (booking_year, booking_month)
;
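If you have a version of MySQL that supports CTEs, you can even do without the calendar table. You can use a CTE that generates the numbers 1-12 as "booking_month" (and join on that field).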
WITH RECURSIVE calendar_months AS (
    -- Generates the month numbers 1 through 12.
    SELECT 1 AS booking_month
    UNION ALL
    SELECT booking_month + 1 FROM calendar_months WHERE booking_month < 12
)
SELECT [year] AS booking_year, cm.booking_month
     , COALESCE(bookingSummary.countT, 0) AS countT -- months without bookings show 0
FROM calendar_months AS cm
LEFT JOIN (
    SELECT MONTH(booking_date) AS booking_month
         , COUNT(*) AS countT
    FROM final_registration AS fr
    WHERE fr.ticket_status IN ('active', 'cancelled')
      AND fr.booking_date BETWEEN [firstdayofyear] AND [lastdayofyear]
    GROUP BY booking_month
) AS bookingSummary
USING (booking_month)
;
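Note: treat my [field] notation as placeholders for parameters. One of the reasons I presented the first version before introducing the CTE version is that it has fewer parameters to maintain.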
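If the number of parameters is a concern, the range boundaries can be derived from the year alone. The following is only a sketch under a few assumptions: MySQL 8.0 or later (for the recursive CTE), a session variable @year standing in for the year parameter, and the column names from the question. It uses a half-open range instead of BETWEEN so that bookings made late on December 31 are still counted.

```sql
SET @year = 2018;  -- hypothetical session variable replacing the year placeholder

WITH RECURSIVE calendar_months AS (
    SELECT 1 AS booking_month
    UNION ALL
    SELECT booking_month + 1 FROM calendar_months WHERE booking_month < 12
)
SELECT @year AS booking_year,
       cm.booking_month,
       COALESCE(bs.countT, 0) AS countT   -- 0 for months without bookings
FROM calendar_months AS cm
LEFT JOIN (
    SELECT MONTH(fr.booking_date) AS booking_month,
           COUNT(*)               AS countT
    FROM final_registration AS fr
    WHERE fr.ticket_status IN ('active', 'cancelled')
      AND fr.booking_date >= MAKEDATE(@year, 1)       -- first day of the year
      AND fr.booking_date <  MAKEDATE(@year + 1, 1)   -- first day of the next year
    GROUP BY MONTH(fr.booking_date)
) AS bs
USING (booking_month)
ORDER BY cm.booking_month;
```

In application code the variable would normally be replaced by a bound parameter, in line with the advice above about not pasting literal values into the query string.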