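I have a dataset of projects. Each project changes status as it moves from start to finish, and the date of every status change is recorded in a table (the table is named "Events", which was not my choice). It looks like this (simplified):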
Date Status
2015-06-01 Start
2015-06-03 Stage 2
2015-06-07 Stage 3
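For any given date range (determined dynamically), I want to be able to see which projects are in which status. However, using BETWEEN or a similar filter on the data only pulls the projects whose status changed during that period, not the ones that were still sitting in a given status.
I have currently built a very clunky solution in Excel that copies each row into new rows for the days between status change dates, like this: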
Date Status
2015-06-01 Project start
2015-06-02 Project start (copied)
2015-06-03 Stage 2
2015-06-04 Stage 2 (copied)
2015-06-05 Stage 2 (copied)
2015-06-06 Stage 2 (copied)
2015-06-07 Stage 3
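This solution lets me query the status of a project on, say, 2015-06-06 and see that it is still in Stage 2.
Is there a way to use MySQL to pull the same data, but as the output of a query? I have heard suggestions to use a calendar table, but I am not sure how that would work. I have also seen a cross join recommended, but again I could not work out from the description how it works.
Thanks in advance for your help!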
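Answer 0: (score: 1)
Plan
- Create a calendar table by cross joining digits and applying date_add over the calendar period.
- Join your data to the calendar source on date <= calendar date.
- Take the max date <= calendar date.
- Join back to the original data source to get the status.
Setup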
drop table if exists calendar_t;
CREATE TABLE calendar_t (
id integer primary key auto_increment not null,
`date` date not null,
day varchar(9) not null,
month varchar(13) not null,
`year` integer not null
);
drop view if exists digits_v;
create view digits_v
as
select 0 as n
union all
select 1
union all
select 2
union all
select 3
union all
select 4
union all
select 5
union all
select 6
union all
select 7
union all
select 8
union all
select 9
;
insert into calendar_t
( `date`, day, month, `year` )
select
date_add('2015-01-01', interval 100*a2.n + 10*a1.n + a0.n day) as `date`,
dayname(date_add('2015-01-01', interval 100*a2.n + 10*a1.n + a0.n day)) as day,
monthname(date_add('2015-01-01', interval 100*a2.n + 10*a1.n + a0.n day)) as month,
year(date_add('2015-01-01', interval 100*a2.n + 10*a1.n + a0.n day)) as `year`
from
digits_v a2
cross join digits_v a1
cross join digits_v a0
order by date_add('2015-01-01', interval 100*a2.n + 10*a1.n + a0.n day)
;
drop table if exists example;
create table example
(
`date` date not null,
status varchar(23) not null
);
insert into example
( `date`, status )
values
( '2015-06-01', 'Start' ),
( '2015-06-03', 'Stage 2' ),
( '2015-06-07', 'Stage 3' )
;
Query
select cal_date, mdate, ex2.status
from
(
select cal_date, max(ex_date) as mdate
from
(
select cal.`date` as cal_date, ex.`date` as ex_date
from calendar_t cal
inner join example ex
on ex.`date` <= cal.`date`
) maxs
group by cal_date
) m2
inner join example ex2
on m2.mdate = ex2.`date`
-- pick a reasonable end date for filtering..
where cal_date <= date('2015-06-15')
order by cal_date
;
Output
+------------------------+------------------------+---------+
| cal_date | mdate | status |
+------------------------+------------------------+---------+
| June, 01 2015 00:00:00 | June, 01 2015 00:00:00 | Start |
| June, 02 2015 00:00:00 | June, 01 2015 00:00:00 | Start |
| June, 03 2015 00:00:00 | June, 03 2015 00:00:00 | Stage 2 |
| June, 04 2015 00:00:00 | June, 03 2015 00:00:00 | Stage 2 |
| June, 05 2015 00:00:00 | June, 03 2015 00:00:00 | Stage 2 |
| June, 06 2015 00:00:00 | June, 03 2015 00:00:00 | Stage 2 |
| June, 07 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 08 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 09 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 10 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 11 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 12 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 13 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 14 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
| June, 15 2015 00:00:00 | June, 07 2015 00:00:00 | Stage 3 |
+------------------------+------------------------+---------+
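The question asks about several projects, while the example table above tracks a single one. Below is a minimal sketch of the same idea extended per project. The `project_events` table and its `project_id` column are hypothetical names introduced here for illustration, and the sketch assumes `calendar_t` has been populated as in the setup above and that a project has at most one status change per day.

```sql
-- Hypothetical table: one row per status change per project.
drop table if exists project_events;
create table project_events
(
  project_id integer not null,
  `date` date not null,
  status varchar(23) not null
);

insert into project_events (project_id, `date`, status)
values
  (1, '2015-06-01', 'Start'),
  (1, '2015-06-03', 'Stage 2'),
  (1, '2015-06-07', 'Stage 3'),
  (2, '2015-06-02', 'Start'),
  (2, '2015-06-05', 'Stage 2');

-- For every calendar day in the range, keep the row holding each project's
-- latest status change on or before that day.
select cal.`date` as cal_date, pe.project_id, pe.status
from calendar_t cal
inner join project_events pe
  on pe.`date` = (
    select max(pe2.`date`)
    from project_events pe2
    where pe2.project_id = pe.project_id
      and pe2.`date` <= cal.`date`
  )
where cal.`date` between '2015-06-01' and '2015-06-08'
order by pe.project_id, cal.`date`;
```

A project produces no rows for days before its first recorded change, because the subquery then returns NULL and the join condition fails. The two dates in the `where` clause are placeholders for the dynamically chosen range.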
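Answer 1: (score: 0)
You do not need to create a table containing every date. You could change your table so that each status has a start date and an end date, and then use a BETWEEN condition.
Or you can use your existing data as it is.
Use @datequery as the date for which you want to find the status.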
-- latest status change on or before @datequery
select Status from Events
where `Date` <= @datequery
order by `Date` desc
limit 1;
This returns the most recent status change on or before the date you query.
@datequery = 2015-06-06
Status
Stage 2
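
The first suggestion in this answer, giving each status a start and end date and filtering with BETWEEN, might look like the sketch below. It derives the end date from the next status change rather than altering the table. The names `events_demo` and `status_ranges_v` are made up for illustration, and the sketch assumes a single project with at most one change per day.

```sql
-- Hypothetical copy of the question's data, so the real table is left alone.
drop table if exists events_demo;
create table events_demo
(
  `Date` date not null,
  Status varchar(23) not null
);

insert into events_demo (`Date`, Status)
values
  ('2015-06-01', 'Start'),
  ('2015-06-03', 'Stage 2'),
  ('2015-06-07', 'Stage 3');

-- Each status runs from its own date until the day before the next change.
-- The latest status has no next change, so its end_date is NULL.
drop view if exists status_ranges_v;
create view status_ranges_v as
select
  e.`Date` as start_date,
  (select date_sub(min(e2.`Date`), interval 1 day)
   from events_demo e2
   where e2.`Date` > e.`Date`) as end_date,
  e.Status
from events_demo e;

set @datequery = '2015-06-06';

-- An open-ended range (NULL end_date) is treated as still current.
select Status
from status_ranges_v
where cast(@datequery as date)
      between start_date and coalesce(end_date, cast(@datequery as date));
```

With the sample rows, the Stage 2 range runs from 2015-06-03 to 2015-06-06, so the final query returns Stage 2 for 2015-06-06.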