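I am trying to format the entry and exit times from an attendance system. So far I have been able to get the data into the following form.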
+----------+--------------+-----------+------------------+------+
| Emp_Name | IO_Date_Only | IO_Status | IO_Time | Flag |
+----------+--------------+-----------+------------------+------+
| AA | 08-01-2018 | Enter | 08-01-2018 11:44 | N |
| AA | 08-01-2018 | Exit | 08-01-2018 11:51 | N |
| AA | 08-01-2018 | Exit | 08-01-2018 11:52 | Y |
| AA | 08-02-2018 | Exit | 08-02-2018 11:44 | N |
| AA | 08-02-2018 | Exit | 08-02-2018 11:51 | Y |
| AA | 08-02-2018 | Exit | 08-02-2018 11:52 | Y |
| BB | 08-01-2018 | Exit | 08-01-2018 11:44 | N |
| BB | 08-01-2018 | Exit | 08-01-2018 11:51 | Y |
| BB | 08-01-2018 | Enter | 08-01-2018 11:52 | N |
| BB | 08-02-2018 | Enter | 08-02-2018 11:44 | N |
| BB | 08-02-2018 | Enter | 08-02-2018 11:51 | Y |
| BB | 08-02-2018 | Exit | 08-02-2018 11:52 | N |
| BB | 08-02-2018 | Enter | 08-02-2018 11:55 | N |
| BB | 08-02-2018 | Exit | 08-02-2018 11:57 | N |
+----------+--------------+-----------+------------------+------+
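Looking at the first two rows of the data above, the first row is an Enter and the second is an Exit. However, the third row is also an Exit. When I finally extract this data, I want to ignore the second row and take the third instead.
Basically, what I want is this: if there are consecutive Enter rows, I need to pull the first row of that group, and if there are consecutive Exit rows, I need to pull the last Exit row of that group. I used Talend to get the source into the format above, but now I am a bit stuck.
The output should look like this: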
+----------+--------------+-----------+------------------+------+
| Emp_Name | IO_Date_Only | IO_Status | IO_Time | Flag |
+----------+--------------+-----------+------------------+------+
| AA | 08-01-2018 | Enter | 08-01-2018 11:44 | N |
| AA | 08-01-2018 | Exit | 08-01-2018 11:52 | Y |
| BB | 08-01-2018 | Enter | 08-01-2018 11:52 | N |
| BB | 08-02-2018 | Enter | 08-02-2018 11:44 | N |
| BB | 08-02-2018 | Exit | 08-02-2018 11:52 | N |
| BB | 08-02-2018 | Enter | 08-02-2018 11:55 | N |
| BB | 08-02-2018 | Exit | 08-02-2018 11:57 | N |
+----------+--------------+-----------+------------------+------+
Answer 0 (score: 0)
Try
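-- IN, OUT and TABLE are reserved words in most SQL dialects, so the aliases are
-- renamed here; your_table is a placeholder for the actual table name.
select emp_name, IO_Status, IO_Date_Only, min(IO_Time) as first_time, max(IO_Time) as last_time
from your_table
group by emp_name, IO_Status, IO_Date_Only;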
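Note that grouping by status and date yields a single row per status per day, so it cannot tell apart several Enter/Exit runs on the same day (BB on 08-02, for example). Below is a minimal Python sketch of the run-based rule from the question. It rests on assumptions inferred from the expected output rather than stated outright: runs are evaluated per employee and per day, an Exit run with no earlier Enter on that day is dropped, and IO_Time is in MM-DD-YYYY HH:MM format. The function name `collapse_runs` is hypothetical.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question:
# (Emp_Name, IO_Date_Only, IO_Status, IO_Time, Flag)
rows = [
    ("AA", "08-01-2018", "Enter", "08-01-2018 11:44", "N"),
    ("AA", "08-01-2018", "Exit",  "08-01-2018 11:51", "N"),
    ("AA", "08-01-2018", "Exit",  "08-01-2018 11:52", "Y"),
    ("AA", "08-02-2018", "Exit",  "08-02-2018 11:44", "N"),
    ("AA", "08-02-2018", "Exit",  "08-02-2018 11:51", "Y"),
    ("AA", "08-02-2018", "Exit",  "08-02-2018 11:52", "Y"),
    ("BB", "08-01-2018", "Exit",  "08-01-2018 11:44", "N"),
    ("BB", "08-01-2018", "Exit",  "08-01-2018 11:51", "Y"),
    ("BB", "08-01-2018", "Enter", "08-01-2018 11:52", "N"),
    ("BB", "08-02-2018", "Enter", "08-02-2018 11:44", "N"),
    ("BB", "08-02-2018", "Enter", "08-02-2018 11:51", "Y"),
    ("BB", "08-02-2018", "Exit",  "08-02-2018 11:52", "N"),
    ("BB", "08-02-2018", "Enter", "08-02-2018 11:55", "N"),
    ("BB", "08-02-2018", "Exit",  "08-02-2018 11:57", "N"),
]


def parse_time(row):
    # Assumption: IO_Time is month-day-year; adjust the format if it is day-month-year.
    return datetime.strptime(row[3], "%m-%d-%Y %H:%M")


def collapse_runs(rows):
    """Keep the first row of each Enter run and the last row of each Exit run."""
    result = []
    # Order by employee, then chronologically, so each day's rows are contiguous.
    ordered = sorted(rows, key=lambda r: (r[0], parse_time(r)))
    for _, day_rows in groupby(ordered, key=lambda r: (r[0], r[1])):
        seen_enter = False
        # Within one employee-day, split the rows into runs of identical status.
        for status, run in groupby(list(day_rows), key=lambda r: r[2]):
            run = list(run)
            if status == "Enter":
                result.append(run[0])    # first Enter of the run
                seen_enter = True
            elif seen_enter:
                result.append(run[-1])   # last Exit of the run
            # Assumption: an Exit run before any Enter that day is skipped.
    return result


for row in collapse_runs(rows):
    print(" | ".join(row))
```

In SQL the same idea would typically be expressed with window functions such as LAG/LEAD partitioned by employee and date, provided the database supports them.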