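I have an employee time-clock database table that I query daily. However, when an employee's shift runs past midnight, the query misses the transactions needed to determine that employee's total hours.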
ID NUMBER (PK)
EmpId VARCHAR
TransActType VARCHAR
Start TIMESTAMP
Account VARCHAR
1 EmpA ClockIn 7/7/20 8am Account1
2 EmpB ClockIn 7/7/20 9am Account7
3 EmpC ClockIn 7/7/20 9am Account1
4 EmpA Switch 7/7/20 10am Account3
5 EmpA Switch 7/7/20 11am Account6
6 EmpC Switch 7/7/20 1pm Account4
7 EmpC ClockOut 7/7/20 3pm
8 EmpD ClockIn 7/7/20 5pm Account5
9 EmpD Switch 7/7/20 6pm Account6
10 EmpB ClockOut 7/7/20 6pm
11 EmpD Switch 7/7/20 7pm Account4
12 EmpA ClockOut 7/7/20 8pm
13 EmpD Switch 7/8/20 1am Account3
14 EmpD ClockOut 7/8/20 2am
15 EmpA ClockIn 7/8/20 8am Account1
...
My query to get the labor for 7/7/20 is
SELECT * FROM labor li where li.start between 7/7/20 12 am and 7/7/20 11:59 pm order by empId, start
and it grabs only records 1-12, but it should also grab 13 and 14.
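My application code computes each employee's account durations for the day by looping through the query results and taking the difference between the start times of that employee's consecutive transactions.
Without records 13 and 14, I cannot determine Employee D's durations for Account4 and Account3.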
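To make the problem concrete, here is a minimal Python sketch of the kind of application-side pairing described above. The question does not say which language the application uses, so the function name `account_durations` and the tuple layout are assumptions for illustration only. It shows that with only the 7/7 rows, EmpD's Account4 segment never gets an end time.

```python
from datetime import datetime

# Sample rows for EmpD from the question: (id, transact_type, start, account).
EMP_D = [
    (8, "ClockIn", datetime(2020, 7, 7, 17), "Account5"),
    (9, "Switch", datetime(2020, 7, 7, 18), "Account6"),
    (11, "Switch", datetime(2020, 7, 7, 19), "Account4"),
    (13, "Switch", datetime(2020, 7, 8, 1), "Account3"),
    (14, "ClockOut", datetime(2020, 7, 8, 2), None),
]


def account_durations(rows):
    """Pair each transaction with the next one for a single employee.

    Returns (closed, open_account): closed is a list of (account, hours);
    open_account is the account still running when the rows end, or None
    if the last row is a ClockOut (or there are no rows).
    """
    closed = []
    for current, following in zip(rows, rows[1:]):
        _, kind, start, account = current
        if kind == "ClockOut":
            continue  # no account is worked between a ClockOut and the next ClockIn
        hours = (following[2] - start).total_seconds() / 3600
        closed.append((account, hours))
    last = rows[-1] if rows else None
    open_account = last[3] if last and last[1] != "ClockOut" else None
    return closed, open_account


midnight = datetime(2020, 7, 8)
same_day_only = [r for r in EMP_D if r[2] < midnight]
print(account_durations(same_day_only))  # Account4 is left open
print(account_durations(EMP_D))          # every segment is closed
```

With the rows cut off at midnight, the function reports Account4 as still open, which is exactly the gap that records 13 and 14 would fill.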
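Perhaps my database design is flawed, since I store only the start time of each transaction and then use application code to compute the durations. I designed it this way so that if records are moved, inserted, or deleted, there is less chance of an employee's labor entries overlapping than if end times were stored as well. The sample labor data above shows that an employee can change accounts throughout the day and have multiple labor transactions per day.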
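I am hoping for a query that can look ahead in time on a per-employee basis: if the last transaction of that employee within the timestamp range is not a "ClockOut", keep grabbing records until one is found.
By the same token, I can't have that employee's first transaction be a ClockOut that belongs to the previous day's shift.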
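The two requirements above can be sketched in Python as a reference for what the query should return. This is only an illustration of the desired behavior, not the application's real code; `shifts_clocked_in_on`, `ts`, and the tuple layout are hypothetical names. A shift is kept when its ClockIn falls on the requested day, and rows are collected until the matching ClockOut, even if that is after midnight.

```python
from datetime import date, datetime
from itertools import groupby


def ts(day, hour):
    """Build a July 2020 timestamp; only used for the sample data below."""
    return datetime(2020, 7, day, hour)


# (id, emp_id, transact_type, start, account), copied from the question.
ROWS = [
    (1, "EmpA", "ClockIn", ts(7, 8), "Account1"),
    (2, "EmpB", "ClockIn", ts(7, 9), "Account7"),
    (3, "EmpC", "ClockIn", ts(7, 9), "Account1"),
    (4, "EmpA", "Switch", ts(7, 10), "Account3"),
    (5, "EmpA", "Switch", ts(7, 11), "Account6"),
    (6, "EmpC", "Switch", ts(7, 13), "Account4"),
    (7, "EmpC", "ClockOut", ts(7, 15), None),
    (8, "EmpD", "ClockIn", ts(7, 17), "Account5"),
    (9, "EmpD", "Switch", ts(7, 18), "Account6"),
    (10, "EmpB", "ClockOut", ts(7, 18), None),
    (11, "EmpD", "Switch", ts(7, 19), "Account4"),
    (12, "EmpA", "ClockOut", ts(7, 20), None),
    (13, "EmpD", "Switch", ts(8, 1), "Account3"),
    (14, "EmpD", "ClockOut", ts(8, 2), None),
    (15, "EmpA", "ClockIn", ts(8, 8), "Account1"),
]


def shifts_clocked_in_on(rows, day):
    """Return every row of each shift whose ClockIn falls on `day`."""
    selected = []
    ordered = sorted(rows, key=lambda r: (r[1], r[3]))  # by employee, then start
    for _, emp_rows in groupby(ordered, key=lambda r: r[1]):
        keep = False  # rows before the first ClockIn belong to an earlier shift
        for row in emp_rows:
            if row[2] == "ClockIn":
                keep = row[3].date() == day
            if keep:
                selected.append(row)
            if row[2] == "ClockOut":
                keep = False  # the shift is finished
    return selected


print([r[0] for r in shifts_clocked_in_on(ROWS, date(2020, 7, 7))])
print([r[0] for r in shifts_clocked_in_on(ROWS, date(2020, 7, 8))])
```

For 7/7 this picks up records 13 and 14 along with 1-12; for 7/8 it returns only record 15, because EmpD's rows 13 and 14 belong to the shift that started the day before.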
If such queries are next to impossible and the database design should be changed instead, I would like to know that.
Answer 0 (score: 1)
I am hoping for a query that could look ahead in time on a per employee basis and if the last transaction of that employee is not a "ClockOut" for the timestamp range, keep grabbing records until one is found.
with
dates(d1, d2) as (select date '2020-07-07', date '2020-07-08' from dual),
main as (select id, empid, transacttype, start_, account,
max(transacttype) keep (dense_rank last order by start_)
over (partition by empid) mtt
from li join dates on d1 <= start_ and start_ < d2),
miss as (select empid, max(id) mnid from main where mtt <> 'ClockOut' group by empid),
cout as (select empid, min(id) mxid
from li join dates on start_ >= d2 join miss using (empid)
where transacttype = 'ClockOut'
group by empid)
select id, empid, transacttype, start_, account from main union all
select li.id, li.empid, li.transacttype, li.start_, li.account from li
join miss on li.empid = miss.empid and li.id > mnid
join cout on li.empid = cout.empid and li.id <= mxid
order by empid, start_
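This query works as you described. It finds all the data for the period and also the last value of transacttype for each employee. If that is not ClockOut, the following steps look for the min(id) of the next ClockOut, which identifies the rows on the following day that are missing for that employee. Finally, the main data and the missing rows are combined with a union.
Be careful, because in your example the value is sometimes Clockout and sometimes ClockOut. You can use upper() or whatever suits your actual data.
If you don't want to rely on id but on dates instead, you can use the date version of the above query: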
with
dates(d1, d2) as (select date '2020-07-07', date '2020-07-08' from dual),
main as (select id, empid, transacttype, start_, account,
max(transacttype) keep (dense_rank last order by start_)
over (partition by empid) mtt
from li join dates on d1 <= start_ and start_ < d2),
miss as (select empid, max(start_) mnst from main where mtt <> 'ClockOut' group by empid),
cout as (select empid, min(start_) mxst
from li join dates on start_ >= d2 join miss using (empid)
where transacttype = 'ClockOut'
group by empid)
select id, empid, transacttype, start_, account from main union all
select li.id, li.empid, li.transacttype, li.start_, li.account from li
join dates on start_ >= d2
join cout on li.empid = cout.empid and li.start_ <= mxst
order by empid, start_
Answer 1 (score: 1)
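If you are on a recent version of Oracle (12c or later), you can use match_recognize() to track the clock-in/switch/clock-out sequence, and then filter on the clock-in time: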
select id, empid, transacttype, start_time, account
from labor
match_recognize (
partition by empid
order by start_time
measures
first(start_time) as grp_clockin
all rows per match
after match skip past last row
pattern (clockin switch* clockout*)
define
clockin as clockin.transacttype = 'ClockIn',
switch as switch.transacttype = 'Switch',
clockout as clockout.transacttype = 'ClockOut'
)
where grp_clockin >= date '2020-07-07'
and grp_clockin < date '2020-07-08'
order by empid, grp_clockin, start_time;
ID | EMPID | TRANSACTTYPE | START_TIME | ACCOUNT
-: | :---- | :----------- | :------------------ | :-------
1 | EmpA | ClockIn | 2020-07-07 08:00:00 | Account1
4 | EmpA | Switch | 2020-07-07 10:00:00 | Account3
5 | EmpA | Switch | 2020-07-07 11:00:00 | Account6
12 | EmpA | ClockOut | 2020-07-07 20:00:00 | null
2 | EmpB | ClockIn | 2020-07-07 09:00:00 | Account7
10 | EmpB | ClockOut | 2020-07-07 18:00:00 | null
3 | EmpC | ClockIn | 2020-07-07 09:00:00 | Account1
6 | EmpC | Switch | 2020-07-07 13:00:00 | Account4
7 | EmpC | ClockOut | 2020-07-07 15:00:00 | null
8 | EmpD | ClockIn | 2020-07-07 17:00:00 | Account5
9 | EmpD | Switch | 2020-07-07 18:00:00 | Account6
11 | EmpD | Switch | 2020-07-07 19:00:00 | Account4
13 | EmpD | Switch | 2020-07-08 01:00:00 | Account3
14 | EmpD | ClockOut | 2020-07-08 02:00:00 | null
Since that filter is applied late, you can use an inline view to narrow the range first, at least down to the minimum date/time:
select id, empid, transacttype, start_time, account
from (
select *
from labor
where start_time >= date '2020-07-07'
)
match_recognize (
partition by empid
order by start_time
measures
first(start_time) as grp_clockin
all rows per match
after match skip past last row
pattern (clockin switch* clockout*)
define
clockin as clockin.transacttype = 'ClockIn',
switch as switch.transacttype = 'Switch',
clockout as clockout.transacttype = 'ClockOut'
)
where grp_clockin >= date '2020-07-07'
and grp_clockin < date '2020-07-08'
order by empid, grp_clockin, start_time;
If you can come up with a sensible cut-off, you can also apply a maximum range, say excluding anything more than a day later:
select id, empid, transacttype, start_time, account
from (
select *
from labor
where start_time >= date '2020-07-07'
and start_time < date '2020-07-09'
)
...
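My application code computes each employee's account durations for the day by looping through the query results and taking the difference between the start times of that employee's consecutive transactions.
If I understand what you are doing, you can also do all of that in the query: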
select empid, grp_start_time, grp_end_time, grp_account,
(grp_end_time - grp_start_time) * interval '1' day as elapsed
from labor
match_recognize (
partition by empid
order by start_time
measures
first(start_time) as grp_start_time,
first(account) as grp_account,
final last(start_time) as grp_end_time
one row per match
after match skip to last grp_end
pattern (grp_start grp_end)
define
grp_start as grp_start.transacttype in ('ClockIn', 'Switch'),
grp_end as grp_end.transacttype in ('Switch', 'ClockOut')
)
where grp_start_time >= date '2020-07-07'
and grp_start_time < date '2020-07-08'
order by empid, grp_start_time;
EMPID | GRP_START_TIME | GRP_END_TIME | GRP_ACCOUNT | ELAPSED
:---- | :------------------ | :------------------ | :---------- | :----------------------------
EmpA | 2020-07-07 08:00:00 | 2020-07-07 10:00:00 | Account1 | +000000000 02:00:00.000000000
EmpA | 2020-07-07 10:00:00 | 2020-07-07 11:00:00 | Account3 | +000000000 01:00:00.000000000
EmpA | 2020-07-07 11:00:00 | 2020-07-07 20:00:00 | Account6 | +000000000 09:00:00.000000000
EmpB | 2020-07-07 09:00:00 | 2020-07-07 18:00:00 | Account7 | +000000000 09:00:00.000000000
EmpC | 2020-07-07 09:00:00 | 2020-07-07 13:00:00 | Account1 | +000000000 04:00:00.000000000
EmpC | 2020-07-07 13:00:00 | 2020-07-07 15:00:00 | Account4 | +000000000 02:00:00.000000000
EmpD | 2020-07-07 17:00:00 | 2020-07-07 18:00:00 | Account5 | +000000000 01:00:00.000000000
EmpD | 2020-07-07 18:00:00 | 2020-07-07 19:00:00 | Account6 | +000000000 01:00:00.000000000
EmpD | 2020-07-07 19:00:00 | 2020-07-08 01:00:00 | Account4 | +000000000 06:00:00.000000000
Answer 2 (score: 0)
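It sounds like you want to offset the date so that the day starts somewhere between 2:00 a.m. and 8:00 a.m. It's not possible to say exactly where, so let's assume 5:00 a.m.
Just subtract 5 hours for the date comparisons. Either: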
where li.start - interval '5' hour >= date '2020-07-07' and
li.start - interval '5' hour < date '2020-07-08'
Or, in a more index-friendly way:
where li.start >= date '2020-07-07' + interval '5' hour and
li.start < date '2020-07-08' + interval '5' hour
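If the same day boundary has to be applied in application code as well, the offset idea can be sketched like this in Python. The 5-hour cut-over is the assumption made in this answer, and the names `business_date` and `business_day_bounds` are made up for illustration.

```python
from datetime import date, datetime, time, timedelta

# Assumed cut-over: the working day runs from 05:00 to 05:00 the next morning.
DAY_START_OFFSET = timedelta(hours=5)


def business_date(stamp, offset=DAY_START_OFFSET):
    """Map a timestamp to the working day it belongs to."""
    return (stamp - offset).date()


def business_day_bounds(day, offset=DAY_START_OFFSET):
    """Return the half-open [start, end) range to use in the WHERE clause."""
    start = datetime.combine(day, time.min) + offset
    return start, start + timedelta(days=1)


print(business_date(datetime(2020, 7, 8, 2)))    # EmpD's 2am ClockOut
print(business_date(datetime(2020, 7, 8, 8)))    # EmpA's next ClockIn
print(business_day_bounds(date(2020, 7, 7)))
```

Computing the bounds once and comparing the raw column against them mirrors the index-friendly form above, since no arithmetic is applied to the stored timestamp.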
Answer 3 (score: 0)
I think this does what you need: sqlfiddle
I'm not sure it is the best solution, but maybe you can use it to build your actual one. Considering the clarification of the requirements in the comments ("so if today is 7/8/20, I don't want to pick up 13 and 14"), the answer by "Ponder Stibbons" may be better than this one.
Answer 4 (score: 0)
You can use the simple analytic function FIRST_VALUE for this:
select li.*
, first_value(decode(TransActType,'ClockOut',"START") ignore nulls)
over(partition by EmpId order by "START" rows between current row and unbounded following) ClockOut
from labor li;
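As you can see, it adds a new column ClockOut that contains the time of the next ClockOut, so you can easily use it to pull the rows you need into your query.
Full test case: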
-- sample data:
with labor(ID,EmpId,TransActType,"START",Account) as (
select 1 ,'EmpA', 'ClockIn ', to_timestamp('7/7/20 8am ','mm/dd/yy hhAM'),'Account1' from dual union all
select 2 ,'EmpB', 'ClockIn ', to_timestamp('7/7/20 9am ','mm/dd/yy hhAM'),'Account7' from dual union all
select 3 ,'EmpC', 'ClockIn ', to_timestamp('7/7/20 9am ','mm/dd/yy hhAM'),'Account1' from dual union all
select 4 ,'EmpA', 'Switch ', to_timestamp('7/7/20 10am','mm/dd/yy hhAM'),'Account3' from dual union all
select 5 ,'EmpA', 'Switch ', to_timestamp('7/7/20 11am','mm/dd/yy hhAM'),'Account6' from dual union all
select 6 ,'EmpC', 'Switch ', to_timestamp('7/7/20 1pm ','mm/dd/yy hhAM'),'Account4' from dual union all
select 7 ,'EmpC', 'ClockOut', to_timestamp('7/7/20 3pm ','mm/dd/yy hhAM'),' ' from dual union all
select 8 ,'EmpD', 'ClockIn ', to_timestamp('7/7/20 5pm ','mm/dd/yy hhAM'),'Account5' from dual union all
select 9 ,'EmpD', 'Switch ', to_timestamp('7/7/20 6pm ','mm/dd/yy hhAM'),'Account6' from dual union all
select 10 ,'EmpB', 'ClockOut', to_timestamp('7/7/20 6pm ','mm/dd/yy hhAM'),' ' from dual union all
select 11 ,'EmpD', 'Switch ', to_timestamp('7/7/20 7pm ','mm/dd/yy hhAM'),'Account4' from dual union all
select 12 ,'EmpA', 'ClockOut', to_timestamp('7/7/20 8pm ','mm/dd/yy hhAM'),' ' from dual union all
select 13 ,'EmpD', 'Switch ', to_timestamp('7/8/20 1am ','mm/dd/yy hhAM'),'Account3' from dual union all
select 14 ,'EmpD', 'ClockOut', to_timestamp('7/8/20 2am ','mm/dd/yy hhAM'),' ' from dual union all
select 15 ,'EmpA', 'ClockIn ', to_timestamp('7/8/20 8am ','mm/dd/yy hhAM'),'Account1' from dual
)
--main query:
select li.*
, first_value(decode(TransActType,'ClockOut',"START") ignore nulls)
over(partition by EmpId order by "START" rows between current row and unbounded following) ClockOut
from labor li;
Result:
| ID | EMPI | TRANSACT | START | ACCOUNT | CLOCKOUT
| ---- | ---- | -------- | ------------------- | -------- | -------------------
| 1 | EmpA | ClockIn | 2020-07-07 08:00:00 | Account1 | 2020-07-07 20:00:00
| 4 | EmpA | Switch | 2020-07-07 10:00:00 | Account3 | 2020-07-07 20:00:00
| 5 | EmpA | Switch | 2020-07-07 11:00:00 | Account6 | 2020-07-07 20:00:00
| 12 | EmpA | ClockOut | 2020-07-07 20:00:00 | | 2020-07-07 20:00:00
| 15 | EmpA | ClockIn | 2020-07-08 08:00:00 | Account1 |
| 2 | EmpB | ClockIn | 2020-07-07 09:00:00 | Account7 | 2020-07-07 18:00:00
| 10 | EmpB | ClockOut | 2020-07-07 18:00:00 | | 2020-07-07 18:00:00
| 3 | EmpC | ClockIn | 2020-07-07 09:00:00 | Account1 | 2020-07-07 15:00:00
| 6 | EmpC | Switch | 2020-07-07 13:00:00 | Account4 | 2020-07-07 15:00:00
| 7 | EmpC | ClockOut | 2020-07-07 15:00:00 | | 2020-07-07 15:00:00
| 8 | EmpD | ClockIn | 2020-07-07 17:00:00 | Account5 | 2020-07-08 02:00:00
| 9 | EmpD | Switch | 2020-07-07 18:00:00 | Account6 | 2020-07-08 02:00:00
| 11 | EmpD | Switch | 2020-07-07 19:00:00 | Account4 | 2020-07-08 02:00:00
| 13 | EmpD | Switch | 2020-07-08 01:00:00 | Account3 | 2020-07-08 02:00:00
| 14 | EmpD | ClockOut | 2020-07-08 02:00:00 | | 2020-07-08 02:00:00
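For readers who want to check the CLOCKOUT column outside the database, the same "first non-null value from the current row onward" idea can be sketched in Python with a single backward scan. This is only an illustration of the window logic, not a replacement for the query; `next_clock_out` and the tuple layout are hypothetical.

```python
from datetime import datetime

# (emp_id, transact_type, start) for two employees from the sample data,
# with each employee's rows in ascending start order.
SAMPLE = [
    ("EmpA", "ClockIn", datetime(2020, 7, 7, 8)),
    ("EmpA", "ClockOut", datetime(2020, 7, 7, 20)),
    ("EmpA", "ClockIn", datetime(2020, 7, 8, 8)),
    ("EmpD", "Switch", datetime(2020, 7, 7, 19)),
    ("EmpD", "Switch", datetime(2020, 7, 8, 1)),
    ("EmpD", "ClockOut", datetime(2020, 7, 8, 2)),
]


def next_clock_out(rows):
    """For each row, return the start of that employee's next ClockOut.

    The current row counts too, like a window of CURRENT ROW AND UNBOUNDED
    FOLLOWING. Rows with no later ClockOut get None.
    """
    result = [None] * len(rows)
    nearest = {}  # emp_id -> closest ClockOut seen so far while scanning backwards
    for i in range(len(rows) - 1, -1, -1):
        emp, kind, start = rows[i]
        if kind == "ClockOut":
            nearest[emp] = start
        result[i] = nearest.get(emp)
    return result


for row, clock_out in zip(SAMPLE, next_clock_out(SAMPLE)):
    print(row[0], row[1], row[2], "->", clock_out)
```

EmpA's ClockIn on 7/8 gets None, matching the empty CLOCKOUT cell for record 15 in the result above.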