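I have an attendance table in Vertica with the attendance date, empno, and timein / timeout columns. The table contains no weekend dates (Friday and Saturday are the weekend).
Please note: 1. The table is DateTimeAttendance. 2. For absent employees, Time-In is NULL.
So my condition is
TimeIn = NULL
I tried the following Vertica analytic function on the data and got the output shown below: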
select name, date, CONDITIONAL_TRUE_EVENT(date - lag(date)>1) over (partition by name order by date) as ConsecutiveDatesCounter
from DateTimeAttendance
where timein is null group by name,date ;
Sample output:
name date ConsecutiveDatesCounter
Aaron Gadsen 19/3/2014 0
Aaron Gadsen 23/3/2014 1
Aaron Gadsen 24/3/2014 1
Aaron Gadsen 25/3/2014 1
Aaron Gadsen 26/3/2014 1
Aaron Gadsen 27/3/2014 1
Aaron Gadsen 30/3/2014 2
Aaron Gadsen 31/3/2014 2
28/3/2014 and 29/3/2014 are weekend days, so I want ConsecutiveDatesCounter to stay at 1 instead of changing to 2.
I would like to get the following output:
name date ConsecutiveDatesCounter
Aaron Gadsen 19/3/2014 0
Aaron Gadsen 23/3/2014 1
Aaron Gadsen 24/3/2014 1
Aaron Gadsen 25/3/2014 1
Aaron Gadsen 26/3/2014 1
Aaron Gadsen 27/3/2014 1
Aaron Gadsen 30/3/2014 1
Aaron Gadsen 31/3/2014 1
The next query, built on top of the result above, is as follows:
select name, count(1) num_days, min(date) startdate, max(date) enddate
from (select name, date, CONDITIONAL_TRUE_EVENT(date - lag(date)>1) over (partition by name order by date) as ConsecutiveDatesCounter
from DateTimeAttendance where timein is null group by name,date ) as consecutive
group by name, consecutiveDatesCounter order by startdate;
The final output should look like this:
name num_days startdate enddate
Aaron Gadsen 1 19/3/2014 19/3/2014
Aaron Gadsen 7 23/3/2014 31/3/2014
Please help me solve this weekend problem in this scenario.
Thanks in advance.
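Answer 0: (score: 0)
It looks like you really need to account for the Friday-to-Monday case. I am assuming from your data set that you have no weekend rows at all (if you do, you will want to filter them out for this to work). Make the event evaluate to false in the Monday/Friday scenario, while making sure the gap is exactly 3 days. It's not pretty, and maybe someone else has a better way.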
CONDITIONAL_TRUE_EVENT( (date - lag(date) > 1) AND NOT ( date - lag(date) = 3 AND DAYOFWEEK(date) = 2 AND DAYOFWEEK(lag(date)) = 6 ) )
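The condition above treats Saturday and Sunday as the weekend. The question states that the weekend is Friday and Saturday, so the gap to bridge there would be Thursday to Sunday. Below is a minimal sketch of the full query with that adjustment. It assumes that DAYOFWEEK returns 1 for Sunday through 7 for Saturday (so Thursday is 5 and Sunday is 1) and that the table has no weekend rows. The weekday numbers are my adaptation, not part of the original answer.

```sql
-- Sketch only: adapts the condition to a Friday/Saturday weekend.
-- Assumption: DAYOFWEEK returns 1 = Sunday ... 7 = Saturday.
select name, count(1) num_days, min(date) startdate, max(date) enddate
from (
    select name, date,
           CONDITIONAL_TRUE_EVENT(
               (date - lag(date) > 1)                   -- a gap normally starts a new run
               AND NOT (date - lag(date) = 3            -- unless the gap is exactly 3 days,
                        AND DAYOFWEEK(date) = 1         -- the current absence is a Sunday,
                        AND DAYOFWEEK(lag(date)) = 5)   -- and the previous one was a Thursday
           ) over (partition by name order by date) as ConsecutiveDatesCounter
    from DateTimeAttendance
    where timein is null
    group by name, date
) as consecutive
group by name, ConsecutiveDatesCounter
order by startdate;
```

In the sample data, 27/3/2014 is a Thursday and 30/3/2014 is a Sunday, so under these assumptions the counter should stay at 1 across that weekend, while the 4-day gap from 19/3/2014 (Wednesday) to 23/3/2014 still starts a new run.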