Calculating overlapping durations in MySQL / PHP

Time: 2013-09-02 19:06:20

Tags: php mysql

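This one is giving me a headache! :P

I have an assignments table, and I want to calculate each member's total duration based on their assignments. In its simplest form, this would be relatively easy.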

------------------------------------------------------
| id | member_id | unit_id | start_date | end_date   |
------------------------------------------------------
| 1  | 2         | 23      | 2013-01-01 | 2013-02-01 |
------------------------------------------------------
| 2  | 2         | 25      | 2013-02-01 | 2013-03-01 |
------------------------------------------------------
| 3  | 2         | 27      | 2013-03-01 | NULL       |
------------------------------------------------------

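That is just a matter of taking the SUM() of DATEDIFF() between end_date and start_date. The problem is that a member can hold more than one assignment at the same time.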

------------------------------------------------------
| id | member_id | unit_id | start_date | end_date   |
------------------------------------------------------
| 1  | 2         | 23      | 2013-01-01 | 2013-02-01 |
------------------------------------------------------
| 2  | 2         | 25      | 2013-02-01 | 2013-03-01 |
------------------------------------------------------
| 3  | 2         | 30      | 2013-02-15 | 2013-03-01 |*
------------------------------------------------------
| 4  | 2         | 27      | 2013-03-01 | NULL       |
------------------------------------------------------

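Now I have to know that #3 runs concurrently with #2, so I should not add it to the SUM().

Going a step further, what if there are gaps in a member's duration?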

------------------------------------------------------
| id | member_id | unit_id | start_date | end_date   |
------------------------------------------------------
| 1  | 2         | 23      | 2013-01-01 | 2013-02-01 |
------------------------------------------------------
| 2  | 2         | 25      | 2013-02-01 | 2013-02-05 |*
------------------------------------------------------
| 3  | 2         | 30      | 2013-02-15 | 2013-03-01 |*
------------------------------------------------------
| 4  | 2         | 27      | 2013-03-01 | NULL       |
------------------------------------------------------

Also, NULL means "current", so CURDATE() should be used in its place.

Any ideas?

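2 Answers:

Answer 0: (score: 1)

Here is an idea. Split each record in two, to get a list of the dates on which assignments start and end. Then determine how many assignments are active on a given date: essentially, add "1" for each start and "-1" for each end, and take the cumulative sum.

Next, you need to determine the next date for each row, so that you have periods, before doing the final aggregation.

The first part is handled by this query: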

-- one row per start (+1) and per end (-1); the running sum is the number of active assignments
select member_id, thedate,
       @sumstart := if(@prevmemberid = member_id, @sumstart + isstart, isstart) as sumstart,
       @prevmemberid := member_id
from (select member_id, start_date as thedate, 1 as isstart
      from assignments
      union all
      -- a NULL end_date means "current", so substitute today's date
      select member_id, coalesce(end_date, CURDATE()), -1 as isstart
      from assignments
      order by member_id, thedate
     ) a cross join
     (select @sumstart := 0, @prevmemberid := NULL) const;
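To make the running sum concrete, here is a minimal Python sketch of the same idea outside the database, using the overlapping sample rows from the question. The function name `active_counts`, the tuple layout, and the fixed `today` value are assumptions made for illustration; none of them come from the query above.

```python
from datetime import date

def active_counts(assignments, today):
    """Return (date, active_count) pairs for one member, in date order.

    assignments: list of (start_date, end_date) tuples;
    an end_date of None means "current" and is replaced by `today`.
    """
    events = []
    for start, end in assignments:
        events.append((start, 1))                               # an assignment starts
        events.append((end if end is not None else today, -1))  # an assignment ends
    events.sort()  # by date; on equal dates the -1 sorts before the +1

    running = 0
    result = []
    for day, delta in events:
        running += delta  # cumulative sum of +1 / -1
        result.append((day, running))
    return result

# Sample data: the second table in the question (member 2, with an overlap).
rows = [
    (date(2013, 1, 1), date(2013, 2, 1)),
    (date(2013, 2, 1), date(2013, 3, 1)),
    (date(2013, 2, 15), date(2013, 3, 1)),
    (date(2013, 3, 1), None),
]
counts = active_counts(rows, today=date(2013, 9, 2))  # `today` fixed for repeatability
```

A count above zero means the member had at least one assignment running from that date until the next one in the list; the count reaches 2 on 2013-02-15, where assignment #3 overlaps #2.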

The rest uses more variables:

select member_id,
       -- count only the periods during which at least one assignment is active
       sum(case when sumstart > 0 then datediff(nextdate, thedate) end) as daysactive
from (select member_id, thedate, sumstart,
             -- rows are read in descending date order, so @nextdate holds the following date
             if(@nextmemberid = member_id, @nextdate, NULL) as nextdate,
             @nextmemberid := member_id,
             @nextdate := thedate
      from (select member_id, thedate,
                   @sumstart := if(@prevmemberid = member_id, @sumstart + isstart, isstart) as sumstart,
                   @prevmemberid := member_id
            from (select member_id, start_date as thedate, 1 as isstart
                  from assignments
                  union all
                  select member_id, coalesce(end_date, CURDATE()), -1 as isstart
                  from assignments
                  order by member_id, thedate
                 ) a cross join
                 (select @sumstart := 0, @prevmemberid := NULL) const
           ) a cross join
           (select @nextmemberid := NULL, @nextdate := NULL) const
      order by member_id, thedate desc
     ) a
group by member_id;

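I don't like using variables this way, because MySQL does not guarantee the order in which variable assignments within a given select are evaluated. In practice, though, they are evaluated in the order written (which this query depends on). Although this could be written without variables, without a with clause, window functions, or even views that allow subqueries in the from clause, the resulting SQL would be a lot uglier.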
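For checking the query's output against a known result, the whole calculation can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `days_active`, the tuple layout, and the fixed `today` date are hypothetical, and the code handles a single member's rows rather than grouping by member_id.

```python
from datetime import date

def days_active(assignments, today):
    """Count the days on which at least one assignment is active.

    assignments: list of (start_date, end_date) tuples for one member;
    an end_date of None means "current" and is replaced by `today`.
    """
    events = []
    for start, end in assignments:
        events.append((start, 1))
        events.append((end if end is not None else today, -1))
    events.sort()

    total = 0
    running = 0
    # Pair every event with the next one to get the period between them.
    for (day, delta), (next_day, _) in zip(events, events[1:]):
        running += delta
        if running > 0:  # at least one assignment covers this period
            total += (next_day - day).days
    return total

today = date(2013, 9, 2)  # fixed so the example is repeatable
overlapping = [
    (date(2013, 1, 1), date(2013, 2, 1)),
    (date(2013, 2, 1), date(2013, 3, 1)),
    (date(2013, 2, 15), date(2013, 3, 1)),
    (date(2013, 3, 1), None),
]
with_gap = [
    (date(2013, 1, 1), date(2013, 2, 1)),
    (date(2013, 2, 1), date(2013, 2, 5)),
    (date(2013, 2, 15), date(2013, 3, 1)),
    (date(2013, 3, 1), None),
]
print(days_active(overlapping, today))  # 244
print(days_active(with_gap, today))     # 234
```

The overlapping rows cover every day from 2013-01-01 to the chosen `today`, so the overlap is counted once; the second data set loses the ten days between 2013-02-05 and 2013-02-15.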

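Answer 1: (score: 0)

I think filtering out the overlapping assignments is easier to do in code than in SQL. You can retrieve all assignments for a given member_id, ordered by start_date: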

select * from assignments where member_id='2' order by start_date asc

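Then you can loop over these assignments and filter out the overlapping ones. Two assignments A and B do not overlap if A ends before B starts, or if B ends before A starts.

Because we sorted the results by start date, we can safely ignore the second case: B never starts before A, so it cannot end before A starts. We then get something like: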

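for i = 0 .. assignments.length - 1
    if (assignments[i] == null) continue; // already discarded as an overlap
    for j = i + 1 .. assignments.length - 1
        // a NULL end_date means "current", so compare against today's date instead
        if (assignments[j] != null && assignments[j].start_date < (assignments[i].end_date ?? today))
            assignments[j] = null; // it overlaps -> get rid of it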

Then loop over the assignments and sum the durations of the non-null ones. That should be easy.
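A possible variation on this loop, sketched in Python: instead of discarding an overlapping assignment, extend the current span to the later end date. This is a swapped-in technique rather than the loop above, because discarding an assignment that overlaps but ends later would also drop the days after the earlier assignment's end. The name `merged_days`, the tuple layout, and the `today` parameter are assumptions for illustration.

```python
from datetime import date

def merged_days(assignments, today):
    """Sum the days covered by at least one assignment of a single member.

    assignments: list of (start_date, end_date) tuples;
    an end_date of None means "current" and is replaced by `today`.
    """
    # Fill in open-ended assignments, then sort by start date.
    spans = sorted(
        (start, end if end is not None else today) for start, end in assignments
    )

    total = 0
    cur_start = cur_end = None
    for start, end in spans:
        if cur_end is None or start > cur_end:
            # Gap (or first row): close the current span and open a new one.
            if cur_end is not None:
                total += (cur_end - cur_start).days
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the current span if needed.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += (cur_end - cur_start).days
    return total

today = date(2013, 9, 2)  # fixed so the example is repeatable
partial_overlap = [
    (date(2013, 1, 1), date(2013, 2, 1)),
    (date(2013, 1, 15), date(2013, 3, 1)),  # overlaps the first row but ends later
]
print(merged_days(partial_overlap, today))  # 59
```

Because the rows are sorted by start date, a single pass is enough, and the tail of a partially overlapping assignment still counts toward the total.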