Question

I'm working on a timesheet database. Put simply, the TimesheetEntries table has four columns:

ID int (identity, 1, 1)
StaffID int
ClockedIn datetime
ClockedOut datetime

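I've been asked to write a report that shows staff attendance by date range. The user enters a date, and the report outputs the clock-in and clock-out times of every staff member who attended, together with their duration on site.

Here is the tricky part, though: staff sometimes leave the site for short periods, and the report is required to ignore these absences (when they are off site for less than 2 hours).

So, let's assume the following entries: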

ID  StaffID  ClockedIn    ClockedOut
1   4        0900         1200
2   4        1330         1730
3   5        0900         1200
4   5        1409         1730
5   4        1830         1930

The output of the report should be:

StaffID  ClockedIn    ClockedOut
4        0900         1930
5        0900         1200     
5        1409         1730

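Is there any way of doing this without a cursor, or even a cursor nested inside a cursor (which is where I am right now!)? We aren't talking about big datasets, and performance isn't really an issue (it's a report, not a production system), but I really don't like cursors and would rather avoid them if I can.

Thanks

Edward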

Answer 1

I'm sure there is a less complicated way to do this, but I was able to accomplish it with a couple of CTEs:

declare @TimeSheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimeSheetEntries
    (
    StaffID,
    ClockedIn,
    ClockedOut
    )
select
    4,
    '2012-01-01 09:00:00',
    '2012-01-01 12:00:00'
union all select
    4,
    '2012-01-01 13:30:00',
    '2012-01-01 17:30:00'
union all select
    5,
    '2012-01-01 09:00:00',
    '2012-01-01 12:00:00'
union all select
    5,
    '2012-01-01 14:09:00',
    '2012-01-01 17:30:00'
union all select 
    4, 
    '2012-01-01 18:30:00', 
    '2012-01-01 19:30:00'       
;
with MultiCheckins as (
    select distinct
        StaffID,
        cast(cast(cast(ClockedIn as float) as int) as datetime) as TimeSheetDate,
        rank() over (
            partition by StaffID, 
            cast(cast(cast(ClockedIn as float) as int) as datetime)
            order by ClockedIn
            ) as ordinal,
        ClockedIn,
        ClockedOut
    from @TimeSheetEntries
), Organized as
(
select
    row_number() over (
        order by
            mc.StaffID,
            mc.TimeSheetDate,
            mc.ClockedIn,
            mc.ClockedOut
            ) as RowID,
    mc.StaffID,
    mc.TimeSheetDate,
    case
        -- Compare in minutes: datediff(hour, ...) only counts hour boundaries crossed.
        when datediff(minute, coalesce(mc3.ClockedOut, mc.ClockedIn), mc.ClockedIn) >= 120
            then mc.ClockedIn 
        else coalesce(mc3.ClockedIn, mc.ClockedIn)
        end as ClockedIn,
    case 
        when datediff(minute, mc.ClockedOut, coalesce(mc2.ClockedIn, mc.ClockedOut)) < 120
            then coalesce(mc2.ClockedOut, mc.ClockedOut)
        else mc.ClockedOut
        end as ClockedOut
from
    MultiCheckins as mc
left outer join
    MultiCheckIns as mc3
        on mc3.StaffID = mc.StaffID
        and mc3.TimeSheetDate = mc.TimeSheetDate
        and mc3.ordinal =  mc.ordinal - 1
left outer join 
    MultiCheckIns as mc2
        on mc2.StaffID = mc.StaffID
        and mc2.TimeSheetDate = mc.TimeSheetDate
        and mc2.ordinal = mc.ordinal + 1
)
select distinct
    o.StaffID,
    o.ClockedIn,
    o.ClockedOut
from Organized as o
where
    not exists (
        select null from Organized as o2
        where o2.RowID <> o.RowID
        and o2.StaffID = o.StaffID
        and 
            (
            o.ClockedIn between o2.ClockedIn and o2.ClockedOut
            and o.ClockedOut between o2.ClockedIn and o2.ClockedOut
            )
        )

Answer 2

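I used the data from Jeremy's answer above, but went about solving the problem in a completely different way. This uses a recursive CTE, which I believe requires SQL Server 2005 or later. It reports the results accurately (I believe), and it also reports the number of clock-ins recorded during the time frame and the total number of minutes off (which can be more than 120, because the limit is only that each individual off-site period is less than two hours).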

declare @TimeSheetEntries table 
    ( 
    ID int identity not null primary key, 
    StaffID int not null, 
    ClockedIn datetime not null, 
    ClockedOut datetime not null 
    ); 

insert into @TimeSheetEntries 
    ( 
    StaffID, 
    ClockedIn, 
    ClockedOut 
    ) 
select 
    4, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    4, 
    '2012-01-01 13:30:00', 
    '2012-01-01 17:30:00' 
union all select 
    5, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    5, 
    '2012-01-01 14:09:00', 
    '2012-01-01 17:30:00'
union all select
    4,
    '2012-01-01 18:30:00', 
    '2012-01-01 19:30:00';


WITH ClockData AS
(
    SELECT ID, StaffID, ClockedIn, ClockedOut AS EffectiveClockout, 1 AS NumClockIns, 0 AS MinutesOff
    FROM @TimeSheetEntries ts
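    -- Anchor rows: entries with no earlier clock-out by the same staff member in the preceding two hours.
    WHERE NOT EXISTS (SELECT ID FROM @TimeSheetEntries tsWhere WHERE tsWhere.StaffID = ts.StaffID AND tsWhere.ClockedOut BETWEEN DATEADD(hour, -2, ts.ClockedIn) AND ts.ClockedIn)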

    UNION ALL

    SELECT cd.ID, cd.StaffID, cd.ClockedIn, ts.ClockedOut AS EffectiveClockout, cd.NumClockIns + 1 AS NumClockIns, cd.MinutesOff + DateDiff(minute, cd.EffectiveClockout, ts.ClockedIn) AS MinutesOff
    FROM @TimeSheetEntries ts
    INNER JOIN ClockData cd
        ON ts.StaffID = cd.StaffID
            AND ts.ClockedIn BETWEEN cd.EffectiveClockout AND dateadd(hour, 2, cd.EffectiveClockout)
)
SELECT *
FROM ClockData cd
WHERE NumClockIns = (SELECT MAX(NumClockIns) FROM ClockData WHERE ID = cd.ID)

Returns:

ID   StaffID   ClockedIn                 EffectiveClockout        NumClockIns   MinutesOff
3    5         2012-01-01 09:00:00.000   2012-01-01 12:00:00.000  1             0
4    5         2012-01-01 14:09:00.000   2012-01-01 17:30:00.000  1             0
1    4         2012-01-01 09:00:00.000   2012-01-01 19:30:00.000  3             150

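Update

In case it isn't clear, MinutesOff is only the 'allowance' time, that is, the time 'eaten' between the ClockedIn and EffectiveClockout shown in the same row. So StaffID 5 took 129 minutes off between clock periods, but none of it was allowance time, so MinutesOff is 0 for both rows.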
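The figures in the result set can be checked by hand with a few DATEDIFF calls. This is only a small arithmetic sketch using the sample timestamps from this answer; the column aliases are made up for illustration.

```sql
select
    -- Staff 4: the two breaks absorbed into the single 09:00-19:30 row.
    datediff(minute, cast('2012-01-01 12:00:00' as datetime), cast('2012-01-01 13:30:00' as datetime))
  + datediff(minute, cast('2012-01-01 17:30:00' as datetime), cast('2012-01-01 18:30:00' as datetime))
        as Staff4MinutesOff,   -- 90 + 60 = 150
    -- Staff 5: a single 129-minute break, too long to absorb, so neither row counts it.
    datediff(minute, cast('2012-01-01 12:00:00' as datetime), cast('2012-01-01 14:09:00' as datetime))
        as Staff5BreakMinutes; -- 129
```

Each of staff 4's breaks is under two hours on its own, which is why their sum can exceed 120.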

Answer 3

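Option 1: Maybe insert the data into a temp table and then use a left join to build your result table (this works only if staff can clock in and out at most twice a day; it will not work if you have 3 entries)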

select *
from timesheet ts
left join timesheet tss on tss.id = ts.id

After this you can take the min and max, and even build a more robust report.

Option 2:

create table #TimeTable (StaffID int, InTime datetime, OutTime datetime)

insert into #TimeTable (StaffID) select distinct StaffID from TimesheetEntries

update s set InTime = (select min(ClockedIn) from TimesheetEntries where StaffID = s.StaffID) from #TimeTable s

update s set OutTime = (select max(ClockedOut) from TimesheetEntries where StaffID = s.StaffID) from #TimeTable s

Given more time I would combine these into two quick queries, but three will do if you aren't worried about performance.

Answer 4

An iterative, set-based approach:

-- Sample data.
declare @TimesheetEntries as Table ( Id Int Identity, StaffId Int, ClockIn DateTime, ClockOut DateTime )
insert into @TimesheetEntries ( StaffId, ClockIn, ClockOut ) values
  ( 4, '2012-05-03 09:00', '2012-05-03 12:00' ),
  ( 4, '2012-05-03 13:30', '2012-05-03 17:30' ), -- This falls within 2 hours of the next two rows.
  ( 4, '2012-05-03 17:35', '2012-05-03 18:00' ),
  ( 4, '2012-05-03 19:00', '2012-05-03 19:30' ),
  ( 4, '2012-05-03 19:45', '2012-05-03 20:00' ),
  ( 5, '2012-05-03 09:00', '2012-05-03 12:00' ),
  ( 5, '2012-05-03 14:09', '2012-05-03 17:30' ),
  ( 6, '2012-05-03 09:00', '2012-05-03 12:00' ),
  ( 6, '2012-05-03 13:00', '2012-05-03 17:00' )
select Id, StaffId, ClockIn, ClockOut from @TimesheetEntries

-- Find all of the periods that need to be coalesced and start the process.
declare @Bar as Table ( Id Int Identity, StaffId Int, ClockIn DateTime, ClockOut DateTime )
insert into @Bar
  select TSl.StaffId, TSl.ClockIn, TSr.ClockOut
    from @TimesheetEntries as TSl inner join
      -- The same staff member and the end of the left period is within two hours of the start of the right period.
      @TimesheetEntries as TSr on TSr.StaffId = TSl.StaffId and DateDiff( ss, TSl.ClockOut, TSr.ClockIn ) between 0 and 7200

-- Continue coalescing periods until we run out of work.
declare @Changed as Bit = 1
while @Changed = 1
  begin
  set @Changed = 0
  -- Coalesce periods.
  update Bl
    -- Take the later   ClockOut   time from the two rows.
    set ClockOut = case when Br.ClockOut >= Bl.ClockOut then Br.ClockOut else Bl.ClockOut end
    from @Bar as Bl inner join
      @Bar as Br on Br.StaffId = Bl.StaffId and
        -- The left row started before the right and either the right period is completely contained in the left or the right period starts within two hours of the end of the left.
        Bl.ClockIn < Br.ClockIn and ( Br.ClockOut <= Bl.ClockOut or DateDiff( ss, Bl.ClockOut, Br.ClockIn ) < 7200 )
  if @@RowCount > 0
    set @Changed = 1
  -- Delete rows where one period is completely contained in another.
  delete Br
    from @Bar as Bl inner join
      @Bar as Br on Br.StaffId = Bl.StaffId and
        ( ( Bl.ClockIn < Br.ClockIn and Br.ClockOut <= Bl.ClockOut ) or ( Bl.ClockIn <= Br.ClockIn and Br.ClockOut < Bl.ClockOut ) )
  if @@RowCount > 0
    set @Changed = 1
  end

-- Return all of the coalesced periods ...
select StaffId, ClockIn, ClockOut, 'Coalesced Periods' as [Type]
  from @Bar
union all
-- ... and all of the independent periods.
select StaffId, ClockIn, ClockOut, 'Independent Period'
  from @TimesheetEntries as TS
  where not exists ( select 42 from @Bar where StaffId = TS.StaffId and ClockIn <= TS.ClockIn and TS.ClockOut <= ClockOut )
order by ClockIn, StaffId

I'm sure there are some optimizations that should be made.

Answer 5

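I think you can do this pretty easily with just a left join back onto itself and a one-off match. The following isn't a complete implementation, but more of a proof of concept: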

create table #TimeSheetEntries 
    ( 
    ID int identity not null primary key, 
    StaffID int not null, 
    ClockedIn datetime not null, 
    ClockedOut datetime not null 
    ); 

insert into #TimeSheetEntries 
    ( 
    StaffID, 
    ClockedIn, 
    ClockedOut 
    ) 
select 
    4, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    4, 
    '2012-01-01 13:30:00', 
    '2012-01-01 17:30:00' 
union all select 
    5, 
    '2012-01-01 09:00:00', 
    '2012-01-01 12:00:00' 
union all select 
    5, 
    '2012-01-01 14:09:00', 
    '2012-01-01 17:30:00'
union all select
    4,
    '2012-01-01 18:30:00', 
    '2012-01-01 19:30:00';


select * from #timesheetentries tse1
left outer join #timesheetentries tse2 on tse1.staffid = tse2.staffid 
  and tse2.id = 
  (
      select MAX(ID) 
      from #timesheetentries ts_max 
      where ts_max.id < tse1.id and tse1.staffid = ts_max.staffid
  )
  outer apply   
  (
  select DATEDIFF(minute, tse2.clockedout, tse1.clockedin) as BreakTime
  ) as breakCheck

where BreakTime > 120 or BreakTime < 0 or tse2.id is null

order by tse1.StaffID, tse1.ClockedIn


   GO
   drop table #timesheetentries
   GO

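The idea here is that you have your original timesheet table, tse1, and you then do a left join onto the same timesheet table, aliased as tse2, matching rows where the StaffID is the same and tse2.ID is the highest ID value that is still less than tse1.ID. This is obviously poor form: you would probably want to use ROW_NUMBER() for this comparison instead of ID, partitioned by StaffID and ordered by the ClockedIn values, since times may not have been entered in chronological order.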
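A minimal sketch of that ROW_NUMBER() variant, under some assumptions: the sample rows are copied into a table variable so the snippet stands on its own, and the CTE name Ordered and the column Seq are made up for illustration. It keeps the same `> 120` test as the query above, so it returns only the rows that start a new on-site period.

```sql
declare @TimeSheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimeSheetEntries (StaffID, ClockedIn, ClockedOut)
select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00'
union all select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00'
union all select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

-- Number each staff member's entries chronologically instead of trusting ID order.
with Ordered as (
    select
        ID, StaffID, ClockedIn, ClockedOut,
        row_number() over (partition by StaffID order by ClockedIn) as Seq
    from @TimeSheetEntries
)
select
    cur.StaffID,
    cur.ClockedIn,
    cur.ClockedOut,
    prev.ClockedOut as PreviousClockedOut
from Ordered as cur
left outer join Ordered as prev
    on prev.StaffID = cur.StaffID
    and prev.Seq = cur.Seq - 1          -- the entry immediately before this one
where prev.ID is null                   -- first entry for this staff member
   or datediff(minute, prev.ClockedOut, cur.ClockedIn) > 120
order by cur.StaffID, cur.ClockedIn;
```

Because the ordering comes from ClockedIn, rows entered late (with a higher ID but an earlier time) are still compared against the correct neighbour.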

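At this point, a single row of the joined table contains the time data for the current timesheet entry as well as for the one before it. This means we can compare the ClockedOut / ClockedIn values of consecutive entries, and with DATEDIFF() we can work out how long the user was away between their previous ClockedOut value and their more recent ClockedIn value. I used OUTER APPLY only because it makes the code cleaner, but you could pack it into a subquery.

Once we've performed the DATEDIFF(), it is trivial to find the cases where the individual's break did not exceed the 120-minute barrier and remove those timesheet entries, leaving only the significant rows of the staff member's timesheet for later reporting.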
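One way the remaining step (turning the significant rows into start/end pairs) might be finished is sketched below. This is not part of the answer's proof of concept: it swaps in LAG() and a running SUM() OVER, which assume SQL Server 2012 or later, and the names Flagged, Grouped, StartsPeriod, PeriodNo and MinutesOnSite are hypothetical. It also assumes a break of exactly 120 minutes starts a new period, following the question's "less than 2 hours" wording.

```sql
declare @TimeSheetEntries table
    (
    ID int identity not null primary key,
    StaffID int not null,
    ClockedIn datetime not null,
    ClockedOut datetime not null
    );

insert into @TimeSheetEntries (StaffID, ClockedIn, ClockedOut)
select 4, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 4, '2012-01-01 13:30:00', '2012-01-01 17:30:00'
union all select 5, '2012-01-01 09:00:00', '2012-01-01 12:00:00'
union all select 5, '2012-01-01 14:09:00', '2012-01-01 17:30:00'
union all select 4, '2012-01-01 18:30:00', '2012-01-01 19:30:00';

with Flagged as (
    select
        StaffID, ClockedIn, ClockedOut,
        -- 0 when the break since the previous clock-out is under two hours, else 1.
        -- The first entry per staff member has no previous row (NULL), so it gets 1.
        case
            when datediff(minute,
                          lag(ClockedOut) over (partition by StaffID order by ClockedIn),
                          ClockedIn) < 120
                then 0
            else 1
            end as StartsPeriod
    from @TimeSheetEntries
), Grouped as (
    select
        StaffID, ClockedIn, ClockedOut,
        -- Running total of the flags gives every entry a period number.
        sum(StartsPeriod) over (partition by StaffID
                                order by ClockedIn
                                rows unbounded preceding) as PeriodNo
    from Flagged
)
select
    StaffID,
    min(ClockedIn) as ClockedIn,
    max(ClockedOut) as ClockedOut,
    datediff(minute, min(ClockedIn), max(ClockedOut)) as MinutesOnSite
from Grouped
group by StaffID, PeriodNo
order by StaffID, min(ClockedIn);
```

For the sample rows this should give one 09:00 to 19:30 row for staff 4 and two separate rows for staff 5, matching the report described in the question.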

Can I do this in a SQL function without a cursor?
