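I have a dataset like the one shown below, and I need to group its rows by the following rule: rows with the same ID whose time ranges are continuous (one row's Out_Time equals the next row's In_Time) should be merged into one row.
If the times are not continuous, keep the rows as they are.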
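| ID | In_time | Out_Time | Total_Mins | seq_num |
|----|----------------|----------------|------------|---------|
| A | 4/1/2014 10:00 | 4/5/2014 10:00 | 5760 | 1 |
| B | 4/2/2014 08:30 | 4/3/2014 08:30 | 1440 | 1 |
| C | 4/3/2014 09:00 | 4/3/2014 16:30 | 450 | 1 |
| C | 4/3/2014 16:30 | 4/4/2014 10:00 | 1050 | 2 |
| C | 4/4/2014 10:00 | 4/6/2014 18:00 | 3360 | 3 |
| D | 4/3/2014 02:00 | 4/4/2014 05:00 | 180 | 1 |
| D | 4/5/2014 06:00 | 4/5/2014 17:00 | 660 | 2 |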
I tried using PARTITION BY with the first_value and last_value functions to get these rows. Below is my query:

```sql
select r1.id
      ,r1.in_time as in_time
      ,r2.out_time as out_time
      ,r1.totalTimeMins + r2.totalTimeMins as totalMins
from rdata r1
inner join rdata r2 on r1.id = r2.id
    and r1.seqNum = r2.seqNum - 1
    and r1.out_time = r2.in_time
```
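Can anyone suggest how to get one row for each continuous time frame, while leaving the other rows unchanged when their time frames are not continuous?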
Answer 0: (score: 0)
Try this and let me know:

    SELECT DISTINCT t.id,
           first_value(t.in_time)  OVER ( PARTITION BY t.id ORDER BY t.seq_num ) in_time,
           first_value(t.out_time) OVER ( PARTITION BY t.id ORDER BY t.seq_num DESC ) out_time,
           SUM(t.total_mins)       OVER ( PARTITION BY t.id ) total_mins
    FROM rdata t
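
Note that the query above returns one row per ID, so D's two non-contiguous rows would be merged as well. A minimal sketch of one way to merge only contiguous rows (the usual "gaps and islands" pattern) is shown here. It assumes the table and column names used in this answer (rdata, seq_num, total_mins) and a database that supports window functions (LAG and a windowed SUM); the names flagged, numbered, new_run and run_no are purely illustrative.

```sql
-- Sketch only: assumes rdata(id, in_time, out_time, total_mins, seq_num).
WITH flagged AS (
    SELECT t.*,
           -- 1 when this row starts a new run: it is the first row of the id
           -- (LAG returns NULL) or its in_time differs from the previous out_time
           CASE WHEN LAG(t.out_time) OVER (PARTITION BY t.id ORDER BY t.seq_num) = t.in_time
                THEN 0 ELSE 1 END AS new_run
    FROM rdata t
),
numbered AS (
    SELECT f.*,
           -- running count of run starts = run number within the id
           SUM(f.new_run) OVER (PARTITION BY f.id ORDER BY f.seq_num) AS run_no
    FROM flagged f
)
SELECT id,
       MIN(in_time)    AS in_time,
       MAX(out_time)   AS out_time,
       SUM(total_mins) AS total_mins
FROM numbered
GROUP BY id, run_no
ORDER BY 1, 2;
```

With the sample data from the question, this should collapse C's three rows into a single row (total 4860 minutes) and leave A, B and both D rows as they are.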
Answer 1: (score: 0)
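Some of the more advanced RDBMSs have fancy built-in features such as analytic functions. I don't know which system you are using, but you can get the job done with plain ANSI-standard SQL. Here is what I have in mind.
Here we query the same table twice and join it to itself, but shift the join key by one so that each record lines up with the record that follows it. We use an OUTER join because, in the sample data, not every ID has more than one time entry.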
[SQL Fiddle][1]
MySQL 5.5.32 Schema Setup:

    CREATE TABLE punch_clock_timecard
        (`ID` varchar(1), `In_time` datetime, `Out_Time` datetime,
         `Total_Mins` int, `seq_num` int);

    INSERT INTO punch_clock_timecard
        (`ID`, `In_time`, `Out_Time`, `Total_Mins`, `seq_num`)
    VALUES
        ('A', '2014-04-01 00:00:00', '2014-04-05 00:00:00', 5760, 1),
        ('B', '2014-04-01 22:30:00', '2014-04-02 22:30:00', 1440, 1),
        ('C', '2014-04-02 23:00:00', '2014-04-03 06:30:00', 450, 1),
        ('C', '2014-04-03 06:30:00', '2014-04-04 00:00:00', 1050, 2),
        ('C', '2014-04-04 00:00:00', '2014-04-06 08:00:00', 3360, 3),
        ('D', '2014-04-02 16:00:00', '2014-04-03 19:00:00', 180, 1),
        ('D', '2014-04-04 20:00:00', '2014-04-05 07:00:00', 660, 2);

Query 1:

    select pct1.id, pct1.out_time, pct2.in_time as next_in_time,
           pct1.seq_num, pct2.seq_num as next_seq_num
    from punch_clock_timecard pct1
    left outer join punch_clock_timecard pct2
      on pct1.id = pct2.id
     and (pct1.seq_num + 1) = pct2.seq_num
    order by pct1.id asc, pct1.seq_num asc, pct2.seq_num asc;

**Results**:
| ID | OUT_TIME | NEXT_IN_TIME | SEQ_NUM | NEXT_SEQ_NUM |
|----|------------------------------|------------------------------|---------|--------------|
| A | April, 05 2014 00:00:00+0000 | (null) | 1 | (null) |
| B | April, 02 2014 22:30:00+0000 | (null) | 1 | (null) |
| C | April, 03 2014 06:30:00+0000 | April, 03 2014 06:30:00+0000 | 1 | 2 |
| C | April, 04 2014 00:00:00+0000 | April, 04 2014 00:00:00+0000 | 2 | 3 |
| C | April, 06 2014 08:00:00+0000 | (null) | 3 | (null) |
| D | April, 03 2014 19:00:00+0000 | April, 04 2014 20:00:00+0000 | 1 | 2 |
| D | April, 05 2014 07:00:00+0000 | (null) | 2 | (null) |
You can wrap this query in another SQL statement that selects from it and runs the final evaluation (i.e. OUT_TIME vs. NEXT_IN_TIME ...)
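
The wrapping step described above could look roughly like this. It is only an illustration against the punch_clock_timecard table from the fiddle: the derived-table alias paired and the output column continuity are made-up names, and it sticks to a plain derived table and CASE so that it should run on MySQL 5.5, which has no window functions.

```sql
-- Sketch only: wrap Query 1 in an outer SELECT and compare the two timestamps.
select paired.id, paired.seq_num, paired.out_time, paired.next_in_time,
       case
         when paired.next_in_time is null            then 'last row'    -- no following record
         when paired.out_time = paired.next_in_time  then 'continuous'  -- next record starts where this one ends
         else 'gap'                                                     -- next record starts at a different time
       end as continuity
from (
    -- Query 1 from above, unchanged apart from dropping its ORDER BY
    select pct1.id, pct1.out_time, pct2.in_time as next_in_time,
           pct1.seq_num, pct2.seq_num as next_seq_num
    from punch_clock_timecard pct1
    left outer join punch_clock_timecard pct2
      on pct1.id = pct2.id
     and (pct1.seq_num + 1) = pct2.seq_num
) paired
order by paired.id asc, paired.seq_num asc;
```

Against the sample rows, C's first two records should come out as 'continuous', D's first record as 'gap', and the last record of each ID as 'last row'. From there, the 'continuous' records are the ones to merge.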
Cheers!
Answer 2: (score: 0)
A modified version:

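    SELECT k.id, MIN(k.in_time) in_time, MAX(k.out_time) out_time, SUM(k.total_mins) total_mins
    FROM (
        -- Tag each row: rows that are part of a continuous chain share one flag per id,
        -- every other row gets its own flag. DECODE and (+) are Oracle-specific syntax.
        -- Note: two separate continuous runs within one id would share the same flag.
        SELECT a.*, DECODE(b.seq, a.seq_num, 'continuous'||a.id, 'not continuous'||a.seq_num) flag
        FROM rdata a,
             ( -- rows whose out_time matches another row's in_time (same id)
               SELECT t.id tid, t.seq_num seq
               FROM rdata t, rdata s
               WHERE t.id = s.id
                 AND t.out_time = s.in_time
               UNION
               -- rows whose in_time matches another row's out_time (same id)
               SELECT t.id tid, s.seq_num seq
               FROM rdata t, rdata s
               WHERE t.id = s.id
                 AND t.out_time = s.in_time ) b
        WHERE a.id = b.tid(+)          -- Oracle outer join: keep rows of a with no match in b
          AND a.seq_num = b.seq(+) ) k
    GROUP BY k.id, k.flag
    ORDER BY 1,2,3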