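I'm trying to work out a report, and I'd like to do everything in the SQL statement rather than looping over a pile of data in a script.
I have a table with the following structure: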
CREATE TABLE `batch_item` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`record_id` int(11) DEFAULT NULL,
`created` DATE NOT NULL,
PRIMARY KEY (`id`),
KEY `record_id` (`record_id`)
);
The created date is always the first day of a month (YYYY-MM-01). The data looks like this:
+------+-----------+------------+
| id   | record_id | created    |
+------+-----------+------------+
|    1 |         1 | 2019-01-01 |
|    2 |         2 | 2019-01-01 |
|    3 |         3 | 2019-01-01 |
|    4 |         1 | 2019-02-01 |
|    5 |         2 | 2019-02-01 |
|    6 |         1 | 2019-03-01 |
|    7 |         3 | 2019-03-01 |
|    8 |         1 | 2019-04-01 |
|    9 |         2 | 2019-04-01 |
+------+-----------+------------+
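Without building a looping script, I'm trying to find the average number of consecutive months for each record. For the data above, the result would be:
Record_id 1 would have an avg of 4 months (one run of 4 consecutive months).
Record_id 2 would be 1.5 (a run of 2 months and a run of 1 month).
Record_id 3 would be 1 (two separate runs of 1 month each).
I could write a script that loops over all the records. I'd just like to avoid that.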
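For anyone who wants to reproduce this, the sample rows above could be loaded as sketched below. It assumes a freshly created, empty batch_item table, so that AUTO_INCREMENT assigns ids 1 through 9 in the order listed.

```sql
-- Sample data from the table above; ids are left to AUTO_INCREMENT
-- (assumes the table is empty, so they come out as 1..9 in this order).
insert into batch_item (record_id, created) values
    (1, '2019-01-01'), (2, '2019-01-01'), (3, '2019-01-01'),
    (1, '2019-02-01'), (2, '2019-02-01'),
    (1, '2019-03-01'), (3, '2019-03-01'),
    (1, '2019-04-01'), (2, '2019-04-01');
```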
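Answer 0 (score: 1)
This is a gaps-and-islands problem. You just need to enumerate the rows to make it work. In MySQL 8+ you would use row_number(), but here you can use a global enumeration with a user variable: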
select record_id, min(created) as min_created, max(created) as max_created, count(*) as num_months
from (select bi.*, (created - interval n month) as grp  -- constant within a run of consecutive months
      from (select bi.*, (@rn := @rn + 1) as n  -- generate some numbers
            from batch_item bi cross join
                 (select @rn := 0) params
            order by bi.record_id, bi.created  -- the table has no "month" column; order by the date
           ) bi
     ) bi
group by record_id, grp;
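Note that with row_number() you would normally partition by record_id. That isn't necessary, though, as long as the numbers are generated in the right order (by record_id, then by date), because the grouping is done per record_id anyway.
The query above returns the islands. To get the final result, you need one more level of aggregation: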
select record_id, avg(num_months) as avg_months
from (select record_id, min(created) as min_created, max(created) as max_created, count(*) as num_months
      from (select bi.*, (created - interval n month) as grp  -- constant within a run of consecutive months
            from (select bi.*, (@rn := @rn + 1) as n  -- generate some numbers
                  from batch_item bi cross join
                       (select @rn := 0) params
                  order by bi.record_id, bi.created  -- the table has no "month" column; order by the date
                 ) bi
           ) bi
      group by record_id, grp
     ) bi
group by record_id;
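For MySQL 8+, the row_number() variant mentioned in this answer might look like the sketch below. It is not part of the original answer: the CTE names (numbered, keyed, islands) and the rn column are illustrative, and it assumes each (record_id, created) pair occurs at most once.

```sql
-- MySQL 8+ sketch: same island logic, with row_number() instead of a user variable.
with numbered as (
    -- number each record's rows in date order
    select record_id, created,
           row_number() over (partition by record_id order by created) as rn
    from batch_item
),
keyed as (
    -- consecutive months minus a consecutive counter give the same date,
    -- so grp is constant within one island
    select record_id, created - interval rn month as grp
    from numbered
),
islands as (
    -- one row per island, with its length in months
    select record_id, grp, count(*) as num_months
    from keyed
    group by record_id, grp
)
select record_id, avg(num_months) as avg_months
from islands
group by record_id
order by record_id;
```

As a worked check on record_id 2 from the question: the rows 2019-01-01, 2019-02-01 and 2019-04-01 get rn 1, 2 and 3, so grp comes out as 2018-12-01, 2018-12-01 and 2019-01-01. That gives two islands of lengths 2 and 1, and an average of 1.5.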
Answer 1 (score: 0)
This solution is untested. It should work in MySQL 8.x, possibly with minor adjustments (I don't remember MySQL's date arithmetic offhand):
with
a as ( -- the last row of each island
  select record_id, created
  from ( -- window functions are not allowed in WHERE, so compute lead() first
    select bi.*,
           lead(created) over (partition by record_id order by created) as next_created
    from batch_item bi
  ) x
  where next_created is null
     or next_created > created + interval 1 month -- next row is not the following month
),
e as ( -- each row, now with the last row of its island
  select b.id, b.record_id, min(a.created) as end_created
  from batch_item b
  join a on b.record_id = a.record_id and b.created <= a.created
  group by b.id, b.record_id
),
m as ( -- each island with the number of months it has
  select
    record_id, end_created, count(*) as months
  from e
  group by record_id, end_created
)
select -- the average length of islands for each record_id
  record_id, avg(months) as avg_months
from m
group by record_id;