I have a SQL Server database that holds shift information for each employee. The main table (called "shift_worked") is structured as follows:
id  employee_id  period  day  hours
1   154          6       5    4.5
2   156          7       12   7.25
3   154          7       6    8
4   154          7       7    6.75
5   142          7       7    5.5
6   156          8       12   7.1
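I need to determine the period and day on which each employee reached the threshold of 500 hours worked... or, of course, be able to determine who has not yet reached that threshold.
I have been looking at recursive queries to handle this, but I cannot work it out.
*****EDIT***** I only mentioned it in a comment, but the database is SQL Server 2008, so unfortunately none of the nice 2012 features will work.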
Answer 0: (score: 0)
As I understand it, the table looks like this:
CREATE TABLE #data (id INT IDENTITY(1,1),
employee_id INT ,
period INT ,
[day] INT,
[hours] DECIMAL (8,3))
Generate some test data:
DECLARE @seed INT = 0,
@max INT = 10000,
@employee INT
WHILE @seed < @max
BEGIN
SET @employee =100 + RAND()*40
INSERT INTO #data
( employee_id, period, day, hours)
VALUES ( @employee, -- employee_id - int
1 + RAND() * 26, -- period - int
1 + RAND() * 14, -- day - int
4 + RAND() * 8 )
SET @seed = @seed + 1
END
Using CROSS APPLY
Calculate the running total of hours for each day + period combination (assuming these are consecutive).
SELECT da.employee_id,
MIN(da.period) AS [Period],
-- Because getting min day gets the lowest day number of all periods
MIN(da.period * 1000 + da.day) % 1000 AS [Day]
FROM #data da
CROSS APPLY (
SELECT d.employee_id, SUM(d.hours) AS [Hours]
FROM #data d
WHERE d.employee_id = da.employee_id
--Ordinal day number since period 1 day 1, assuming 14 days per period;
--<= includes the current day's hours in the running total
AND d.day + d.period * 14 <= da.day + da.period * 14
GROUP BY d.employee_id) total
WHERE total.Hours >= 500
GROUP BY da.employee_id
ORDER BY da.employee_id
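Even with the newly computed WHERE clause, the query takes about 1 second on the 10k records I generated. You could probably gain performance by indexing employee/day/period... I would run the profiler to confirm that part.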
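A minimal sketch of the indexing suggestion above, written against the #data temp table from this answer. The index name and the column order are assumptions for illustration, not a measured recommendation. Because the CROSS APPLY filter compares the computed expression d.day + d.period * 14, only the employee_id equality is likely to be used for a seek, so check the actual execution plan before keeping the index.

```sql
-- Hypothetical index name; the column order is an assumption, not a benchmark result.
-- employee_id leads because the CROSS APPLY filters on it with an equality predicate.
-- [hours] is included so the running total can be read from the index alone.
CREATE NONCLUSTERED INDEX IX_data_employee_period_day
    ON #data (employee_id, period, [day])
    INCLUDE ([hours]);
```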
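Answer 1: (score: 0)
Hello, it looks like you are after a cumulative total. Take a look at https://msdn.microsoft.com/en-us/library/ms189461.aspx. Here is an example that uses Max's very useful generator:
DECLARE @data TABLE (employee_id INT, period INT, day INT, hours INT)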
DECLARE @seed INT = 0
WHILE @seed < 10000
begin
INSERT INTO @data
( employee_id, period, day, hours )
VALUES ( 100 + RAND()*40 , -- employee_id - int
1 + RAND() * 8, -- period - int
1 + RAND() * 14, -- day - int
4 + RAND() * 8 -- hours - int (the fractional part is truncated on insert)
)
SET @seed = @seed + 1
END
SELECT * FROM
(
select employee_id,period,day, hours
, CumulativeTotal
, row_number() over (partition by employee_id order by cumulativetotal) ROWNUMBER
from
(
select employee_id,period,day, hours
,SUM(hours) OVER (partition by employee_id
ORDER BY period,day
ROWS UNBOUNDED PRECEDING) AS CumulativeTotal
from @data
--where employee_id = 100
) s
where cumulativetotal >= 500
) T
WHERE T.ROWNUMBER = 1
order by T.employee_id ,T.period,T.day
/*Prove it by dropping into excel and adding a column in excel to confirm cumulative total*/
select employee_id ,period,day,hours
,SUM(hours) OVER (partition by employee_id
ORDER BY period,day
ROWS UNBOUNDED PRECEDING) AS CumulativeTotal
from @data
where employee_id = 101
order by employee_id,period,day
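SUM(...) OVER with an ORDER BY and a ROWS frame requires SQL Server 2012 or later, so the queries above will not run on the 2008 instance mentioned in the question. The following is a hedged sketch of the same cumulative-total idea in 2008-compatible syntax, using a correlated subquery against the question's shift_worked table. It assumes that (employee_id, period, day) identifies a single shift row; the CTE names running and ranked are made up for illustration.

```sql
-- Sketch for SQL Server 2008: running total via a correlated subquery.
-- Assumes one row per (employee_id, period, [day]).
WITH running AS (
    SELECT s.employee_id, s.period, s.[day],
           -- Sum every shift of the same employee up to and including this day.
           (SELECT SUM(p.[hours])
              FROM shift_worked p
             WHERE p.employee_id = s.employee_id
               AND (p.period < s.period
                    OR (p.period = s.period AND p.[day] <= s.[day]))) AS CumulativeTotal
    FROM shift_worked s
),
ranked AS (
    -- Number the rows at or past the threshold, earliest first.
    SELECT employee_id, period, [day], CumulativeTotal,
           ROW_NUMBER() OVER (PARTITION BY employee_id
                              ORDER BY period, [day]) AS rn
    FROM running
    WHERE CumulativeTotal >= 500
)
SELECT employee_id, period, [day], CumulativeTotal
FROM ranked
WHERE rn = 1
ORDER BY employee_id;
```

The correlated subquery rescans each employee's rows for every shift, so it scales roughly quadratically per employee. That is usually acceptable for a few thousand rows, but worth testing on real data.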
Answer 2: (score: 0)
As long as you have at least SQL Server 2012, window functions are your best option.
with IsThresholdReached (employee_id, period, day, threshold_reached)
as (
select employee_id, period, day,
case when
sum(hours) over (partition by employee_id order by period, day rows unbounded preceding) >= 500
then 1 else 0 end
from shift_worked
),
ThresholdFirstReached (employee_id, period, day, first_reached_period, first_reached_day)
as (
select employee_id, period, day,
first_value(period) over (partition by employee_id order by period, day rows unbounded preceding),
first_value(day) over (partition by employee_id order by period, day rows unbounded preceding)
from IsThresholdReached
where threshold_reached = 1
)
select employee_id, period, day
from ThresholdFirstReached
where period = first_reached_period
and day = first_reached_day
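Explanation: the first expression above works out whether a given employee has reached the threshold at a given period and day by tracking the cumulative sum of hours worked. The second expression determines the first period and day on which that happened, and the final query selects the actual rows whose period and day equal those values.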
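The question also asks who has not yet reached the threshold, which none of the queries in this thread return. A small sketch, assuming the shift_worked table from the question, that runs on SQL Server 2008 as well; the total_hours alias is made up for illustration.

```sql
-- Employees whose total hours are still below the 500-hour threshold.
SELECT employee_id, SUM([hours]) AS total_hours
FROM shift_worked
GROUP BY employee_id
HAVING SUM([hours]) < 500
ORDER BY employee_id;
```

Employees with no rows at all in shift_worked will not appear here; listing them would need a join from a separate employee table, which the question does not describe.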