Question

I have a SQL Server database that contains shift information for each employee. The main table (called "shift_worked") is structured as follows:

id    employee_id  period    day   hours
1     154          6         5     4.5
2     156          7         12    7.25
3     154          7         6     8
4     154          7         7     6.75
5     142          7         7     5.5
6     156          8         12    7.1

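I need to determine the period and day on which each employee reaches the threshold of 500 hours worked... or, of course, be able to determine who has not yet reached that threshold.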

I have been looking at recursive queries to handle this, but I can't get my head around it.

*****EDIT***** I only mentioned it in the comments, but the database is SQL Server 2008 - sadly, none of the nice 2012 commands will work.

Answer 1

As I understand it, our table looks something like this:

CREATE TABLE #data (id INT IDENTITY(1,1),
 employee_id  INT ,
 period    INT ,
 [day] INT,
 [hours] DECIMAL (8,3))

Generate some test data:

DECLARE @seed INT = 0, 
    @max INT = 10000,
    @employee INT

WHILE @seed < @max
BEGIN
    SET @employee =100 + RAND()*40 

    INSERT INTO #data
            ( employee_id, period, day, hours)
    VALUES  ( @employee, -- employee_id - int
              1 + RAND() * 26, -- period - int
              1 + RAND() * 14, -- day - int
              4 + RAND() * 8  )

    SET @seed = @seed + 1
END

Use CROSS APPLY to calculate the running total of hours for each day + period combination (assuming these are sequential).

SELECT da.employee_id, 
 MIN(da.period) AS [Period],
 -- Because getting min day gets the lowest day number of all periods
 MIN(da.period * 1000 + da.day) % 1000  AS [Day]

FROM #data da
CROSS APPLY (
    SELECT d.employee_id, SUM(d.hours) AS [Hours]
    FROM #data d
    WHERE d.employee_id = da.employee_id    
    -- Days elapsed since period 1, day 1 (assumes 14 days per period).
    -- <= includes the current day's hours in the running total.
    AND d.day + d.period * 14 <= da.day + da.period * 14
    GROUP BY d.employee_id) total 

WHERE total.Hours >= 500
GROUP BY da.employee_id
ORDER BY da.employee_id

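Even with the newly computed WHERE clause, the query takes about 1 second on the 10k records I generated. You could probably gain performance by indexing employee/day/period... I would run the profiler to determine that part.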
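
The indexing suggestion above could be sketched roughly as follows. The index name is hypothetical, and whether it actually helps is an assumption to check against the execution plan. The date predicate is computed from day and period, so the optimizer can only seek on employee_id, but including hours lets the index cover the SUM.

```sql
-- Hypothetical index name. Key columns follow the filter (employee_id)
-- and the ordering columns (period, day); hours is included so the
-- running-total subquery can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_data_employee_period_day
    ON #data (employee_id, period, [day])
    INCLUDE ([hours]);
```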

Answer 2

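Hi, it looks like you are after a cumulative total. Take a look at https://msdn.microsoft.com/en-us/library/ms189461.aspx. Here is an example using Max's very useful generator:

DECLARE @data TABLE (employee_id int, period int, day int, hours decimal(8,3))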

DECLARE @seed INT = 0
WHILE @seed < 10000
begin
    INSERT INTO @data
            ( employee_id, period, day, hours )
    VALUES  ( 100 + RAND()*40 , -- employee_id - int
              1 + RAND() * 8, -- period - int
              1 + RAND() * 14, -- day - int
              4 + RAND() * 8 -- hours - decimal
              )

    SET @seed = @seed + 1
END

SELECT * FROM
(
select   employee_id,period,day, hours
         ,      CumulativeTotal
         , row_number() over (partition by employee_id order by  cumulativetotal) ROWNUMBER
from 
(
select   employee_id,period,day, hours
        ,SUM(hours) OVER (partition by employee_id
         ORDER BY period,day 
          ROWS UNBOUNDED PRECEDING) AS CumulativeTotal
from @data
--where employee_id = 100 
) s
where cumulativetotal >= 500
) T
WHERE T.ROWNUMBER = 1
order by T.employee_id ,T.period,T.day

/*Prove it by dropping into excel and adding a column in excel to confirm cumulative total*/
select employee_id ,period,day,hours
        ,SUM(hours) OVER (partition by employee_id
         ORDER BY period,day 
          ROWS UNBOUNDED PRECEDING) AS CumulativeTotal
from @data
where   employee_id = 101
order   by  employee_id,period,day

Answer 3

As long as you have at least SQL Server 2012, window functions are your best bet.

with IsThresholdReached (employee_id, period, day, threshold_reached) 
as (
    select employee_id, period, day,
           case when 
              sum(hours) over (partition by employee_id order by period, day rows unbounded preceding) >= 500
           then 1 else 0 end
    from shift_worked
),

ThresholdFirstReached (employee_id, period, day, first_reached_period, first_reached_day)
as (
    select employee_id, period, day,
           first_value(period) over (partition by employee_id order by period, day rows unbounded preceding),
           first_value(day) over (partition by employee_id order by period, day rows unbounded preceding)
    from IsThresholdReached
    where threshold_reached = 1
)

select employee_id, period, day
from ThresholdFirstReached
where period = first_reached_period
and day = first_reached_day

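Explanation: the first expression above works out whether a given employee has reached the threshold on a given period and day by tracking the running total of hours worked. The second expression determines the first period and day on which that happens, and the final SELECT picks the actual rows whose period and day equal those values.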
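
Since the question targets SQL Server 2008, where SUM(...) OVER (... ORDER BY ...) is not available, here is a rough sketch of the same idea using a correlated subquery for the running total. It assumes that (period, day) orders the shifts chronologically, as the answers above do. The CTE names and the cumulative_hours and total_hours aliases are made up for illustration. The last statement is one possible way to list employees who are still below the threshold.

```sql
-- Sketch for SQL Server 2008: running total via a correlated subquery,
-- then the earliest (period, day) per employee at or above 500 hours.
WITH RunningTotal AS (
    SELECT s.employee_id, s.period, s.[day],
           (SELECT SUM(p.hours)
            FROM shift_worked p
            WHERE p.employee_id = s.employee_id
              AND (p.period < s.period
                   OR (p.period = s.period AND p.[day] <= s.[day]))
           ) AS cumulative_hours
    FROM shift_worked s
),
Ranked AS (
    SELECT employee_id, period, [day], cumulative_hours,
           ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY period, [day]) AS rn
    FROM RunningTotal
    WHERE cumulative_hours >= 500
)
SELECT employee_id, period, [day], cumulative_hours
FROM Ranked
WHERE rn = 1
ORDER BY employee_id;

-- Employees who have not yet reached the threshold.
SELECT employee_id, SUM(hours) AS total_hours
FROM shift_worked
GROUP BY employee_id
HAVING SUM(hours) < 500
ORDER BY employee_id;
```

The correlated subquery rescans each employee's earlier rows, so it scales worse than the windowed version on large tables; an index on (employee_id, period, day) is likely to help.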

SQL Server - reaching a threshold
