Question

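I have an employee time-clock database table that I query daily. However, when an employee's shift crosses midnight, the query misses transactions that are needed to determine that employee's total hours.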

ID NUMBER (PK)
EmpId VARCHAR
TransActType VARCHAR
Start TIMESTAMP
Account VARCHAR

1   EmpA ClockIn  7/7/20 8am  Account1
2   EmpB ClockIn  7/7/20 9am  Account7
3   EmpC ClockIn  7/7/20 9am  Account1
4   EmpA Switch   7/7/20 10am Account3
5   EmpA Switch   7/7/20 11am Account6
6   EmpC Switch   7/7/20 1pm  Account4
7   EmpC ClockOut 7/7/20 3pm 
8   EmpD ClockIn  7/7/20 5pm  Account5
9   EmpD Switch   7/7/20 6pm  Account6
10  EmpB ClockOut 7/7/20 6pm
11  EmpD Switch   7/7/20 7pm  Account4
12  EmpA ClockOut 7/7/20 8pm
13  EmpD Switch   7/8/20 1am  Account3
14  EmpD ClockOut 7/8/20 2am
15  EmpA ClockIn  7/8/20 8am  Account1
...

My query for the labor on 7/7/20 is:

SELECT * FROM labor li where li.start between 7/7/20 12 am and 7/7/20 11:59 pm order by empId, start

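and it only grabs records 1-12, but it should also grab 13 and 14.

My application code calculates each employee's account durations for the day by looping through the query results and diffing the start times of that employee's consecutive transactions.

Without records 13 and 14, I cannot determine EmpD's Account4 and Account3 durations.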
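
The effect of the missing rows can be illustrated with a minimal Python sketch of the duration calculation described above. The asker's application language is not stated, so the tuple layout and the name `account_durations` are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# EmpD's rows from the sample data: (id, emp_id, type, start, account).
ROWS = [
    (8, "EmpD", "ClockIn", datetime(2020, 7, 7, 17), "Account5"),
    (9, "EmpD", "Switch", datetime(2020, 7, 7, 18), "Account6"),
    (11, "EmpD", "Switch", datetime(2020, 7, 7, 19), "Account4"),
    (13, "EmpD", "Switch", datetime(2020, 7, 8, 1), "Account3"),
    (14, "EmpD", "ClockOut", datetime(2020, 7, 8, 2), None),
]


def account_durations(rows):
    """Sum time per (employee, account) from consecutive start times.

    Assumes rows are sorted by employee and then by start time. A row's
    duration ends at the next row of the same employee; the last row of
    an employee has no known end and contributes nothing.
    """
    totals = defaultdict(timedelta)
    for current, following in zip(rows, rows[1:]):
        _, emp, kind, start, account = current
        if kind == "ClockOut" or following[1] != emp:
            continue  # no account is being worked after a ClockOut
        totals[(emp, account)] += following[3] - start
    return dict(totals)


full = account_durations(ROWS)
truncated = account_durations(ROWS[:3])  # what the midnight cut-off returns
```

With all five rows, Account4 gets six hours and Account3 one hour. With only rows 8, 9 and 11, the Account4 entry has no end time and Account3 does not appear at all.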

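Perhaps my database design is flawed, since I store only the start time of each transaction and then use application code to calculate the durations. I chose this design so that, if records are moved, inserted, or deleted, there is less chance of an employee's labor entries overlapping than if end times were stored as well. The sample labor data above shows that an employee can change accounts throughout the day and have multiple labor transactions per day.

I am hoping for a query that could look ahead in time on a per employee basis and, if the last transaction of that employee is not a "ClockOut" for the timestamp range, keep grabbing records until one is found.

By the same reasoning, I can't have that employee's first transaction in the result be a ClockOut that belongs to the previous day's shift.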
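
One way to read these two requirements together is that a row belongs to a day if the shift containing it clocked in on that day. The Python sketch below spells out that reading on a subset of the sample data. It is an interpretation, not the asker's code, and the name `rows_for_day` is hypothetical.

```python
from datetime import datetime
from itertools import groupby

# Subset of the sample data: (id, emp_id, type, start).
SAMPLE = [
    (1, "EmpA", "ClockIn", datetime(2020, 7, 7, 8)),
    (4, "EmpA", "Switch", datetime(2020, 7, 7, 10)),
    (5, "EmpA", "Switch", datetime(2020, 7, 7, 11)),
    (8, "EmpD", "ClockIn", datetime(2020, 7, 7, 17)),
    (9, "EmpD", "Switch", datetime(2020, 7, 7, 18)),
    (11, "EmpD", "Switch", datetime(2020, 7, 7, 19)),
    (12, "EmpA", "ClockOut", datetime(2020, 7, 7, 20)),
    (13, "EmpD", "Switch", datetime(2020, 7, 8, 1)),
    (14, "EmpD", "ClockOut", datetime(2020, 7, 8, 2)),
    (15, "EmpA", "ClockIn", datetime(2020, 7, 8, 8)),
]


def rows_for_day(rows, day_start, day_end):
    """Return the rows of every shift that clocks in within [day_start, day_end)."""
    selected = []
    ordered = sorted(rows, key=lambda r: (r[1], r[3]))  # by employee, then start
    for _, emp_rows in groupby(ordered, key=lambda r: r[1]):
        in_shift = False  # True while inside a shift that began in the range
        for row in emp_rows:
            kind, start = row[2], row[3]
            if kind == "ClockIn":
                # Only shifts that start inside the range are wanted.
                in_shift = day_start <= start < day_end
            if in_shift:
                selected.append(row)  # may lie past day_end: the look-ahead
            if kind == "ClockOut":
                in_shift = False
    return selected


july_7 = [r[0] for r in rows_for_day(SAMPLE, datetime(2020, 7, 7), datetime(2020, 7, 8))]
july_8 = [r[0] for r in rows_for_day(SAMPLE, datetime(2020, 7, 8), datetime(2020, 7, 9))]
```

Under this reading, the 7/7 result includes rows 13 and 14 for EmpD, while the 7/8 result contains only row 15, so the previous day's ClockOut is not picked up.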

If such a query is next to impossible and the database design should be changed instead, I would like to know that.

Answer 1

I am hoping for a query that could look ahead in time on a per employee basis and if the last transaction of that employee is not a "ClockOut" for the timestamp range, keep grabbing records until one is found.

with 
  dates(d1, d2) as (select date '2020-07-07', date '2020-07-08' from dual), 
  main as (select id, empid, transacttype, start_, account, 
                  max(transacttype) keep (dense_rank last order by start_) 
                  over (partition by empid) mtt
           from li join dates on d1 <= start_ and start_ < d2),
  miss as (select empid, max(id) mnid from main where mtt <> 'ClockOut' group by empid),
  cout as (select empid, min(id) mxid 
            from li join dates on start_ >= d2 join miss using (empid) 
            where transacttype = 'ClockOut' 
            group by empid) 
select id, empid, transacttype, start_, account from main union all
select li.id, li.empid, li.transacttype, li.start_, li.account from li
  join miss on li.empid = miss.empid and li.id > mnid
  join cout on li.empid = cout.empid and li.id <= mxid
  order by empid, start_

dbfiddle

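This query works as you described. I find all the data for the period, and also the last value of transacttype for each employee. If it is not ClockOut, then in the next steps I look for the min(id) of a later ClockOut for those employees. The last step is a union of the main data and the missing rows.

Please be careful, because in your example the value is sometimes Clockout and sometimes ClockOut. You can use upper() or whatever else suits your real data.

If you do not want to rely on id but on dates, you can use this date-based version of the query above: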

with 
  dates(d1, d2) as (select date '2020-07-07', date '2020-07-08' from dual), 
  main as (select id, empid, transacttype, start_, account, 
                  max(transacttype) keep (dense_rank last order by start_) 
                  over (partition by empid) mtt
           from li join dates on d1 <= start_ and start_ < d2),
  miss as (select empid, max(start_) mnst from main where mtt <> 'ClockOut' group by empid),
  cout as (select empid, min(start_) mxst 
            from li join dates on start_ >= d2 join miss using (empid) 
            where transacttype = 'ClockOut' 
            group by empid) 
select id, empid, transacttype, start_, account from main union all
select li.id, li.empid, li.transacttype, li.start_, li.account from li
  join dates on start_ >= d2
  join cout on li.empid = cout.empid and li.start_ <= mxst
  order by empid, start_

dbfiddle

Answer 2

If you are on a recent version of Oracle, you can use match_recognize to track each clock-in/switch/clock-out sequence, and then filter on the clock-in time:

select id, empid, transacttype, start_time, account
from labor
match_recognize (
  partition by empid
  order by start_time
  measures
    first(start_time) as grp_clockin
  all rows per match
  after match skip past last row
  pattern (clockin switch* clockout*)
  define
    clockin as clockin.transacttype = 'ClockIn',
    switch as switch.transacttype = 'Switch',
    clockout as clockout.transacttype = 'ClockOut'
)
where grp_clockin >= date '2020-07-07'
and grp_clockin < date '2020-07-08'
order by empid, grp_clockin, start_time;

ID | EMPID | TRANSACTTYPE | START_TIME          | ACCOUNT 
-: | :---- | :----------- | :------------------ | :-------
 1 | EmpA  | ClockIn      | 2020-07-07 08:00:00 | Account1
 4 | EmpA  | Switch       | 2020-07-07 10:00:00 | Account3
 5 | EmpA  | Switch       | 2020-07-07 11:00:00 | Account6
12 | EmpA  | ClockOut     | 2020-07-07 20:00:00 | null    
 2 | EmpB  | ClockIn      | 2020-07-07 09:00:00 | Account7
10 | EmpB  | ClockOut     | 2020-07-07 18:00:00 | null    
 3 | EmpC  | ClockIn      | 2020-07-07 09:00:00 | Account1
 6 | EmpC  | Switch       | 2020-07-07 13:00:00 | Account4
 7 | EmpC  | ClockOut     | 2020-07-07 15:00:00 | null    
 8 | EmpD  | ClockIn      | 2020-07-07 17:00:00 | Account5
 9 | EmpD  | Switch       | 2020-07-07 18:00:00 | Account6
11 | EmpD  | Switch       | 2020-07-07 19:00:00 | Account4
13 | EmpD  | Switch       | 2020-07-08 01:00:00 | Account3
14 | EmpD  | ClockOut     | 2020-07-08 02:00:00 | null

Because the filter is applied late, you can use an inline view to narrow the range first, at least down to the minimum date/time:

select id, empid, transacttype, start_time, account
from (
  select *
  from labor
  where start_time >= date '2020-07-07'
)
match_recognize (
  partition by empid
  order by start_time
  measures
    first(start_time) as grp_clockin
  all rows per match
  after match skip past last row
  pattern (clockin switch* clockout*)
  define
    clockin as clockin.transacttype = 'ClockIn',
    switch as switch.transacttype = 'Switch',
    clockout as clockout.transacttype = 'ClockOut'
)
where grp_clockin >= date '2020-07-07'
and grp_clockin < date '2020-07-08'
order by empid, grp_clockin, start_time;

db<>fiddle

If you can come up with a reasonable limit, you can set a maximum for the range too, say by excluding anything more than a day later:

select id, empid, transacttype, start_time, account
from (
  select *
  from labor
  where start_time >= date '2020-07-07'
  and start_time < date '2020-07-09'
)
...

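My application code calculates the account durations per employee for that day by looping through the query results and diffing the start times of consecutive employee transactions.

If I understand what you are doing, you could do all of that in the query too: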

select empid, grp_start_time, grp_end_time, grp_account,
  (grp_end_time - grp_start_time) * interval '1' day as elapsed
from labor
match_recognize (
  partition by empid
  order by start_time
  measures
    first(start_time) as grp_start_time,
    first(account) as grp_account,
    final last(start_time) as grp_end_time
  one row per match
  after match skip to last grp_end
  pattern (grp_start grp_end)
  define
    grp_start as grp_start.transacttype in ('ClockIn', 'Switch'),
    grp_end as grp_end.transacttype in ('Switch', 'ClockOut')
)
where grp_start_time >= date '2020-07-07'
and grp_start_time < date '2020-07-08'
order by empid, grp_start_time;

EMPID | GRP_START_TIME      | GRP_END_TIME        | GRP_ACCOUNT | ELAPSED                      
:---- | :------------------ | :------------------ | :---------- | :----------------------------
EmpA  | 2020-07-07 08:00:00 | 2020-07-07 10:00:00 | Account1    | +000000000 02:00:00.000000000
EmpA  | 2020-07-07 10:00:00 | 2020-07-07 11:00:00 | Account3    | +000000000 01:00:00.000000000
EmpA  | 2020-07-07 11:00:00 | 2020-07-07 20:00:00 | Account6    | +000000000 09:00:00.000000000
EmpB  | 2020-07-07 09:00:00 | 2020-07-07 18:00:00 | Account7    | +000000000 09:00:00.000000000
EmpC  | 2020-07-07 09:00:00 | 2020-07-07 13:00:00 | Account1    | +000000000 04:00:00.000000000
EmpC  | 2020-07-07 13:00:00 | 2020-07-07 15:00:00 | Account4    | +000000000 02:00:00.000000000
EmpD  | 2020-07-07 17:00:00 | 2020-07-07 18:00:00 | Account5    | +000000000 01:00:00.000000000
EmpD  | 2020-07-07 18:00:00 | 2020-07-07 19:00:00 | Account6    | +000000000 01:00:00.000000000
EmpD  | 2020-07-07 19:00:00 | 2020-07-08 01:00:00 | Account4    | +000000000 06:00:00.000000000

db<>fiddle

Answer 3

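It sounds like you want to offset the date, so that the day starts at some time between 2:00 a.m. and 8:00 a.m. There is no way to tell exactly where, so let's assume 5:00 a.m.

Just subtract 5 hours for the date comparison. Either: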

where li.start - interval '5' hour >= date '2020-07-07' and 
      li.start - interval '5' hour < date '2020-07-08'

Or, in a more index-friendly way:

where li.start >= date '2020-07-07' + interval '5' hour and
      li.start < date '2020-07-08' + interval '5' hour
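
The same shifted boundary can be sketched in Python for readers who compute the range in application code before binding it to the query. The 5-hour offset is the assumption made in this answer, and the helper names `business_day_bounds` and `in_business_day` are hypothetical.

```python
from datetime import date, datetime, timedelta

DAY_OFFSET = timedelta(hours=5)  # assumed cut-over time, as in this answer


def business_day_bounds(day):
    """Half-open [start, end) range for the business day beginning on `day`."""
    start = datetime(day.year, day.month, day.day) + DAY_OFFSET
    return start, start + timedelta(days=1)


def in_business_day(ts, day):
    """True if timestamp `ts` falls inside the business day of `day`."""
    start, end = business_day_bounds(day)
    return start <= ts < end


bounds = business_day_bounds(date(2020, 7, 7))
late_clockout = in_business_day(datetime(2020, 7, 8, 2), date(2020, 7, 7))
next_morning = in_business_day(datetime(2020, 7, 8, 8), date(2020, 7, 7))
```

With this offset, EmpD's 2 a.m. ClockOut on 7/8 counts toward 7/7, while EmpA's 8 a.m. ClockIn on 7/8 does not. A shift that runs past the cut-over time would still be split, so this only works if no shift crosses the chosen hour.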

Answer 4

I think this does what you need: sqlfiddle

I am not sure this is the best solution, but maybe you can use it to build your actual solution. Given the clarification of the requirement in the comments ("So if today is 7/8/20, I would not want to pick up 13 and 14"), the answer by "Ponder Stibbons" is probably better than this one.

Answer 5

You can use the simple analytic function FIRST_VALUE for this:

select li.*
  , first_value(decode(TransActType,'ClockOut',"START") ignore nulls)
       over(partition by EmpId order by "START" rows between current row and unbounded following) ClockOut
from labor li;

As you can see, it adds a new column ClockOut that contains the time of the next ClockOut, so you can easily add the rows you need to your query.

Full test case:

-- sample data:
with labor(ID,EmpId,TransActType,"START",Account) as (
   select 1   ,'EmpA', 'ClockIn ', to_timestamp('7/7/20 8am ','mm/dd/yy hhAM'),'Account1' from dual union all 
   select 2   ,'EmpB', 'ClockIn ', to_timestamp('7/7/20 9am ','mm/dd/yy hhAM'),'Account7' from dual union all 
   select 3   ,'EmpC', 'ClockIn ', to_timestamp('7/7/20 9am ','mm/dd/yy hhAM'),'Account1' from dual union all 
   select 4   ,'EmpA', 'Switch  ', to_timestamp('7/7/20 10am','mm/dd/yy hhAM'),'Account3' from dual union all 
   select 5   ,'EmpA', 'Switch  ', to_timestamp('7/7/20 11am','mm/dd/yy hhAM'),'Account6' from dual union all 
   select 6   ,'EmpC', 'Switch  ', to_timestamp('7/7/20 1pm ','mm/dd/yy hhAM'),'Account4' from dual union all 
   select 7   ,'EmpC', 'ClockOut', to_timestamp('7/7/20 3pm ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 8   ,'EmpD', 'ClockIn ', to_timestamp('7/7/20 5pm ','mm/dd/yy hhAM'),'Account5' from dual union all 
   select 9   ,'EmpD', 'Switch  ', to_timestamp('7/7/20 6pm ','mm/dd/yy hhAM'),'Account6' from dual union all 
   select 10  ,'EmpB', 'ClockOut', to_timestamp('7/7/20 6pm ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 11  ,'EmpD', 'Switch  ', to_timestamp('7/7/20 7pm ','mm/dd/yy hhAM'),'Account4' from dual union all 
   select 12  ,'EmpA', 'ClockOut', to_timestamp('7/7/20 8pm ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 13  ,'EmpD', 'Switch  ', to_timestamp('7/8/20 1am ','mm/dd/yy hhAM'),'Account3' from dual union all 
   select 14  ,'EmpD', 'ClockOut', to_timestamp('7/8/20 2am ','mm/dd/yy hhAM'),'        ' from dual union all 
   select 15  ,'EmpA', 'ClockIn ', to_timestamp('7/8/20 8am ','mm/dd/yy hhAM'),'Account1' from dual 
)
--main query:
select li.*
  , first_value(decode(TransActType,'ClockOut',"START") ignore nulls)
       over(partition by EmpId order by "START" rows between current row and unbounded following) ClockOut
from labor li;

Results:

|   ID | EMPI | TRANSACT | START               | ACCOUNT  | CLOCKOUT
| ---- | ---- | -------- | ------------------- | -------- | -------------------
|    1 | EmpA | ClockIn  | 2020-07-07 08:00:00 | Account1 | 2020-07-07 20:00:00
|    4 | EmpA | Switch   | 2020-07-07 10:00:00 | Account3 | 2020-07-07 20:00:00
|    5 | EmpA | Switch   | 2020-07-07 11:00:00 | Account6 | 2020-07-07 20:00:00
|   12 | EmpA | ClockOut | 2020-07-07 20:00:00 |          | 2020-07-07 20:00:00
|   15 | EmpA | ClockIn  | 2020-07-08 08:00:00 | Account1 | 
|    2 | EmpB | ClockIn  | 2020-07-07 09:00:00 | Account7 | 2020-07-07 18:00:00
|   10 | EmpB | ClockOut | 2020-07-07 18:00:00 |          | 2020-07-07 18:00:00
|    3 | EmpC | ClockIn  | 2020-07-07 09:00:00 | Account1 | 2020-07-07 15:00:00
|    6 | EmpC | Switch   | 2020-07-07 13:00:00 | Account4 | 2020-07-07 15:00:00
|    7 | EmpC | ClockOut | 2020-07-07 15:00:00 |          | 2020-07-07 15:00:00
|    8 | EmpD | ClockIn  | 2020-07-07 17:00:00 | Account5 | 2020-07-08 02:00:00
|    9 | EmpD | Switch   | 2020-07-07 18:00:00 | Account6 | 2020-07-08 02:00:00
|   11 | EmpD | Switch   | 2020-07-07 19:00:00 | Account4 | 2020-07-08 02:00:00
|   13 | EmpD | Switch   | 2020-07-08 01:00:00 | Account3 | 2020-07-08 02:00:00
|   14 | EmpD | ClockOut | 2020-07-08 02:00:00 |          | 2020-07-08 02:00:00
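
For comparison, the "next ClockOut per row" column that FIRST_VALUE produces in the query above can be emulated in application code with one backward pass per employee. This is only an illustrative Python sketch on a few of the sample rows, and the name `add_next_clockout` is hypothetical.

```python
from datetime import datetime

# A few of the sample rows: (id, emp_id, type, start).
SAMPLE = [
    (12, "EmpA", "ClockOut", datetime(2020, 7, 7, 20)),
    (15, "EmpA", "ClockIn", datetime(2020, 7, 8, 8)),
    (11, "EmpD", "Switch", datetime(2020, 7, 7, 19)),
    (13, "EmpD", "Switch", datetime(2020, 7, 8, 1)),
    (14, "EmpD", "ClockOut", datetime(2020, 7, 8, 2)),
]


def add_next_clockout(rows):
    """Append to each row the start time of the employee's next ClockOut.

    The current row counts, so a ClockOut row gets its own start time.
    Rows with no later ClockOut get None.
    """
    ordered = sorted(rows, key=lambda r: (r[1], r[3]))  # by employee, then start
    next_out = {}  # emp_id -> start of the nearest ClockOut at or after the row
    result = []
    for row in reversed(ordered):  # walk backwards so "next" is already known
        _, emp, kind, start = row
        if kind == "ClockOut":
            next_out[emp] = start
        result.append(row + (next_out.get(emp),))
    result.reverse()
    return result


with_clockout = add_next_clockout(SAMPLE)
```

Row 11 is paired with the 2 a.m. ClockOut on 7/8 and row 15 with None, which matches the CLOCKOUT column in the results table.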

Oracle query by date range but get subsequent records outside of this range when needed
