Question

I'm pulling some data from a remote API into a local SQL Server table, in the format below. (Imagine it sorted by StatusDT descending.)

DriverID     StatusDT                 Status
--------     --------                 ------
b103         2019-03-05 05:42:52.000  D
b103         2019-03-03 23:45:42.000  SB
b103         2019-03-03 21:49:41.000  ON

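What is the best way to end up with a query that shows the total time each driver spent in each status on each day?

Also, there may be gaps of a whole day or more between status updates. In that case I need one row per skipped day showing the previous status continuing from 00:00:00 to 23:59:59. So if I loop through this table to populate another table with the structure below, the example above would need to look like this (again, sorted by date descending):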

DriverID  StartDT              EndDT               Status
--------  ---------------      --------------      ------
b103      2019-03-05 05:42:52                      D
b103      2019-03-05 00:00:00  2019-03-05 05:42:51 SB
b103      2019-03-04 00:00:00  2019-03-04 23:59:59 SB
b103      2019-03-03 23:45:42  2019-03-03 23:59:59 SB
b103      2019-03-03 21:49:41  2019-03-03 23:45:41 ON

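Does that make sense?

I ended up dumping the API data into a "work" table and running a cursor over it to add rows to another table (with start and end date/times), but I'm curious whether there is a more efficient way.

Many thanks.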

Answer 1

I think this query does what you need. However, I wasn't able to test it, so it may contain syntax errors:

with x as (
  select
    DriverID,
    StatusDT as StartDT,
    -- the next status change for the same driver ends the current status
    lead(StatusDT) over(partition by DriverID order by StatusDT) as EndDT,
    Status
  from my_table
)
select -- start & end on the same day, or status still open
  DriverID,
  StartDT,
  EndDT,
  Status
from x
where convert(date, StartDT) = convert(date, EndDT)
   or EndDT is null
union all
select -- start & end on different days; first part, up to midnight of the end day
  DriverID,
  StartDT,
  -- cast back to datetime first: dateadd(ms, ...) is not supported on the date type
  dateadd(ms, -3, convert(datetime, convert(date, EndDT))) as EndDT,
  Status
from x
where convert(date, StartDT) <> convert(date, EndDT)
  and EndDT is not null
union all
select -- start & end on different days; end day from midnight
  DriverID,
  convert(datetime, convert(date, EndDT)) as StartDT,
  EndDT,
  Status
from x
where convert(date, StartDT) <> convert(date, EndDT)
  and EndDT is not null
-- note: a gap longer than one day is split only at its last midnight,
-- not once per skipped day
order by StartDT desc
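
Since the query above was not tested, a small scratch table holding the question's three sample rows makes it easy to try. This is only a minimal sketch: the temp table name #my_table and the column types are assumptions (the question does not give the table definition), and the query would need to read from #my_table instead of my_table.

```sql
-- Hypothetical scratch table; column types are guesses based on the sample data.
create table #my_table (
  DriverID varchar(10) not null,
  StatusDT datetime    not null,
  Status   varchar(2)  not null
);

-- The three sample rows from the question, written in ISO 8601 form
-- so they parse the same way regardless of the session's date format.
insert into #my_table (DriverID, StatusDT, Status)
values ('b103', '2019-03-05T05:42:52.000', 'D'),
       ('b103', '2019-03-03T23:45:42.000', 'SB'),
       ('b103', '2019-03-03T21:49:41.000', 'ON');
```

The gap between 2019-03-03 and 2019-03-05 in these rows is a useful check of how a query treats the skipped day, 2019-03-04.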

Answer 2

Most of your answer is just using lead():

select driverid, status, statusdt,
       lead(statusdt) over (partition by driverid order by statusdt) as enddte
from t;

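This does not break the intervals by day, but you can add those breaks. I think the simplest way is to add the dates (using a recursive CTE) and work out the status in effect at each of those times. So:

I would do the following:

Use a recursive CTE to calculate the dates
"Fill in" the status for each date and union the result with the original table
Use lead() to get the end date

This looks like: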

with day_boundaries as (
      -- first midnight after each driver's earliest status change
      select driverid, dateadd(day, 1, cast(min(statusdt) as date)) as statusdt, max(statusdt) as finaldt
      from t
      group by driverid
      having datediff(day, min(statusdt), max(statusdt)) > 0
      union all
      -- one more midnight per step, stopping before the latest status change
      select driverid, dateadd(day, 1, statusdt), finaldt
      from day_boundaries
      where dateadd(day, 1, statusdt) < finaldt
     ),
     unioned as (
      select driverid, status, statusdt
      from t
      union all
      -- at each midnight, carry forward the same driver's most recent status
      select db.driverid, s.status, db.statusdt
      from day_boundaries db cross apply
           (select top (1) t.status
            from t
            where t.driverid = db.driverid and t.statusdt < db.statusdt
            order by t.statusdt desc
           ) s
     )
select driverid, status, statusdt,
       lead(statusdt) over (partition by driverid order by statusdt) as enddte
from unioned;

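Note that this does not subtract any seconds from the end date. Each end date matches the start date of the next record. Time is continuous, and it makes no sense to have gaps between records that should fit snugly together.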
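
To get from these intervals to the total time per driver, per day, per status that the question asks for, the rows can be grouped and their durations summed. The following is a rough sketch, not a tested solution: it assumes the day-split result has been saved into a hypothetical table or view named status_intervals with the columns driverid, status, statusdt and enddte, and that no interval crosses midnight.

```sql
-- status_intervals is a hypothetical name for the saved output of the
-- day-split query: one row per driver, status and interval within a day.
select driverid,
       convert(date, statusdt) as statusdate,
       status,
       -- end dates equal the next start date, so durations add up
       -- without any one-second correction
       sum(datediff(second, statusdt, enddte)) as seconds_in_status
from status_intervals
where enddte is not null   -- the latest status has no end yet
group by driverid, convert(date, statusdt), status
order by driverid, statusdate, status;
```

Because an interval that ends at midnight is grouped by its start date, the full 86400 seconds of a skipped day are credited to that day.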

Create a status log from rows of status-change datetimes
