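I have a table with the following structure:
id, timestamp, deviceId, datatype, measure
The measure column holds the value for the row's datatype. For example, when a process starts, a row is written with datatype 19 and measure 1. When it completes, a row is written with datatype 19 and measure 0, and a new row with the same timestamp is inserted with datatype 54 and some value as its measure. In other words, on completion the system fires a trigger that updates this table.
Sample data:
1001, 2013-01-02 09:20:00, 501, 19, 1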
1005, 2013-01-02 10:00:00, 501, 19, 0
1006, 2013-01-02 10:00:00, 501, 54, 65
Rows 1005 and 1006 have the same timestamp; the timestamp of 1001 is always less than that of 1005.
1011, 2013-01-02 09:20:00, 601, 19, 1
1015, 2013-01-02 10:00:00, 601, 19, 0
1016, 2013-01-02 10:00:00, 601, 54, 105
Rows 1015 and 1016 have the same timestamp; the timestamp of 1011 is always less than that of 1015.
1021, 2013-01-02 09:20:00, 701, 19, 1
1022, 2013-01-02 10:00:00, 701, 19, 0
1023, 2013-01-02 10:00:00, 701, 54, 81
Rows 1022 and 1023 have the same timestamp; the timestamp of 1021 is always less than that of 1022.
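The same process can run for multiple devices at the same time.
The requirement is to find the start and end time of each completed transaction, for example: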
1006, 2013-01-02 09:20:00, 2013-01-02 10:00:00, 501, 65
1016, 2013-01-02 09:20:00, 2013-01-02 10:00:00, 601, 105
1023, 2013-01-02 09:20:00, 2013-01-02 10:00:00, 701, 81
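I'm writing SQL queries again after a 5-year gap and I'm completely stuck. Any pointers or suggestions would be highly appreciated.
Thanks in advance.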
Answer 0 (score: 2)
CREATE TABLE t
    (id int, ts timestamp, deviceId int, datatype int, measure int)
;
INSERT INTO t
    (id, ts, deviceId, datatype, measure)
VALUES
    (1001, '2013-01-02 09:20:00', 501, 19, 1),
    (1005, '2013-01-02 10:00:00', 501, 19, 0),
    (1006, '2013-01-02 10:00:00', 501, 54, 65),
    (1007, '2013-01-02 10:20:00', 501, 19, 1),
    (1008, '2013-01-02 11:00:00', 501, 19, 0),
    (1009, '2013-01-02 11:00:00', 501, 54, 65),
    (1011, '2013-01-02 09:20:00', 601, 19, 1),
    (1015, '2013-01-02 10:00:00', 601, 19, 0),
    (1016, '2013-01-02 10:00:00', 601, 54, 105),
    (1021, '2013-01-02 09:20:00', 701, 19, 1),
    (1022, '2013-01-02 10:00:00', 701, 19, 0),
    (1023, '2013-01-02 10:00:00', 701, 54, 81)
;
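-- Drop the "finished" marker rows (datatype 19, measure 0) so that each
-- device's remaining rows alternate: start (19/1), result (54), ...
-- Numbering them per device and bucketing in pairs gives p = 0, 0, 1, 1, ...
with parted as (
    select floor((rn - 1) / 2.0) p, *
    from (
        select
            row_number() over (partition by deviceId order by ts, datatype) rn,
            id, ts, deviceId, dataType, measure
        from t
        where not (datatype = 19 and measure = 0)
    ) s
)
-- Join each start row to the result row in the same bucket.
select
    p1.id, p0.ts "start", p1.ts "end", p1.deviceId, p1.measure
from
    parted p0
    inner join
    parted p1 on
        p0.deviceId = p1.deviceId
        and p0.p = p1.p
        and p0.datatype = 19 and p1.datatype = 54
order by p1.id
;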
  id  |        start        |         end         | deviceid | measure
------+---------------------+---------------------+----------+---------
 1006 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      501 |      65
 1009 | 2013-01-02 10:20:00 | 2013-01-02 11:00:00 |      501 |      65
 1016 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      601 |     105
 1023 | 2013-01-02 09:20:00 | 2013-01-02 10:00:00 |      701 |      81
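The answer gives no explanation, so here is how I read the pairing trick, as a rough sketch in plain Python rather than SQL. The function name pair_rows and the tuple layout are my own assumptions for illustration; the logic is meant to mirror the query (filter, number per device, bucket in twos, keep 19/54 pairs), not to replace it.

```python
from itertools import groupby

def pair_rows(rows):
    """Pair each start row with its result row.

    rows: iterable of (id, ts, deviceId, datatype, measure) tuples,
    with ts as an ISO 'YYYY-MM-DD HH:MM:SS' string (sorts correctly as text).
    """
    # Same filter as the SQL: drop the (datatype 19, measure 0) marker rows.
    kept = [r for r in rows if not (r[3] == 19 and r[4] == 0)]
    # Same ordering as row_number(): per device, by ts, then datatype.
    kept.sort(key=lambda r: (r[2], r[1], r[3]))
    out = []
    for device, grp in groupby(kept, key=lambda r: r[2]):
        grp = list(grp)
        # floor((rn - 1) / 2) with 1-based rn equals index // 2 here,
        # so consecutive rows (0, 1), (2, 3), ... share a bucket.
        for i in range(0, len(grp) - 1, 2):
            start, result = grp[i], grp[i + 1]
            if start[3] == 19 and result[3] == 54:
                out.append((result[0], start[1], result[1], device, result[4]))
    return sorted(out)
```

Note that the pairing relies on starts and results strictly alternating per device; a start that never completes in the middle of the sequence would shift the buckets.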
Answer 1 (score: 0)
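My logic is a simple aggregation. The aggregation key, however, is the "next" record with datatype 54 for the same device ID.
To get that next record, I use a correlated subquery in the select list: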
select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(deviceId) as deviceId,
       MAX(case when id = next54 then measure end) as measure
from (select t.*,
             -- smallest datatype-54 id at or after this row, for the same device
             (select MIN(id) from t t2 where t2.id >= t.id and t2.datatype = 54 and t2.deviceId = t.deviceId) as next54
      from t
     ) t
group by next54
The rest is just aggregation.
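To make the "next 54" key concrete, here is a minimal sketch that runs the correlated-subquery idea against the question's sample rows. It uses Python's built-in sqlite3 module purely as a scratch database (an assumption on my part; the thread does not say which engine the asker uses). The helper name completed_transactions and the extra "next54 is not null" filter, which skips starts that have no completion yet, are my own additions.

```python
import sqlite3

ROWS = [
    (1001, "2013-01-02 09:20:00", 501, 19, 1),
    (1005, "2013-01-02 10:00:00", 501, 19, 0),
    (1006, "2013-01-02 10:00:00", 501, 54, 65),
    (1011, "2013-01-02 09:20:00", 601, 19, 1),
    (1015, "2013-01-02 10:00:00", 601, 19, 0),
    (1016, "2013-01-02 10:00:00", 601, 54, 105),
    (1021, "2013-01-02 09:20:00", 701, 19, 1),
    (1022, "2013-01-02 10:00:00", 701, 19, 0),
    (1023, "2013-01-02 10:00:00", 701, 54, 81),
]

QUERY = """
select next54 as id,
       min(timestamp) as starttime,
       max(timestamp) as endtime,
       max(deviceId) as deviceId,
       max(case when id = next54 then measure end) as measure
from (select t.*,
             (select min(t2.id)
              from t t2
              where t2.id >= t.id
                and t2.datatype = 54
                and t2.deviceId = t.deviceId) as next54
      from t) x
where next54 is not null   -- assumption: ignore starts with no completion
group by next54
order by next54
"""

def completed_transactions(rows):
    """Load rows into an in-memory table t and run the aggregation."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table t (id integer, timestamp text, deviceId integer, "
        "datatype integer, measure integer)"
    )
    con.executemany("insert into t values (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result
```

Calling completed_transactions(ROWS) gives one row per datatype-54 record, e.g. (1006, '2013-01-02 09:20:00', '2013-01-02 10:00:00', 501, 65) for device 501. Timestamps are stored as ISO-formatted text here, so min and max compare them correctly as strings.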
Since I'm personally not a big fan of correlated subqueries, you can also write this with window functions (called analytic functions in Oracle):
select next54 as id, MIN(timestamp) as starttime, MAX(timestamp) as endtime, MAX(deviceId) as deviceId,
       MAX(case when id = next54 then measure end) as measure
from (select t.*,
             -- running minimum of the datatype-54 ids, walking ids downward
             min(id54) over (partition by deviceId order by id desc) as next54
      from (select t.*,
                   (case when datatype = 54 then id end) as id54
            from t
           ) t
     ) t
group by next54
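The min function with an order by clause in its window computes a "cumulative" (running) minimum. With order by id desc, each row sees the smallest datatype-54 id among the rows of its device with the same or a higher id, which is exactly the next 54 record. The result should be the same as with the correlated subquery.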
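If the cumulative minimum is hard to picture, a small sketch like the following prints next54 for every row. Again this uses Python's sqlite3 as a stand-in database, which is my assumption; it needs an SQLite build with window-function support (3.25 or later). The helper name next54_per_row and the trimmed three-column table are illustrative only.

```python
import sqlite3

QUERY = """
select id, datatype,
       min(case when datatype = 54 then id end)
           over (partition by deviceId order by id desc) as next54
from t
order by id
"""

def next54_per_row(rows):
    """rows: (id, deviceId, datatype) tuples; returns (id, datatype, next54)."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (id integer, deviceId integer, datatype integer)")
    con.executemany("insert into t values (?, ?, ?)", rows)
    out = con.execute(QUERY).fetchall()
    con.close()
    return out

# Two completed transactions on device 501, plus a trailing start (1010)
# that has no datatype-54 row after it.
SAMPLE = [
    (1001, 501, 19), (1005, 501, 19), (1006, 501, 54),
    (1007, 501, 19), (1008, 501, 19), (1009, 501, 54),
    (1010, 501, 19),
]

for row in next54_per_row(SAMPLE):
    print(row)
```

Rows 1001, 1005 and 1006 all get next54 = 1006, rows 1007 to 1009 get 1009, and the trailing row 1010 gets NULL (None in Python) because no 54 row follows it. In the full query that last case ends up as a group with a NULL id, which you may want to filter out.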
Answer 2 (score: 0)
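It's possible I'm greatly oversimplifying the problem here, but I see no reason why, for each record with datatype 54, you can't just look up the previous record for that device where the datatype is 19 and the measure is 1: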
SELECT  result.ID,
        result.DeviceID,
        MAX(start.Timestamp) StartTime,
        result.Timestamp EndTime,
        result.Measure
FROM    T result
        INNER JOIN T start
            ON start.DeviceID = result.DeviceID
            AND start.Timestamp < result.Timestamp
            AND start.DataType = 19
            AND start.Measure = 1
WHERE   result.DataType = 54
GROUP BY result.ID, result.DeviceID, result.Timestamp, result.Measure
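The only real difference is that instead of starting at the beginning and working forward to the result, I start from the result and work backward. This will fail if processes run concurrently on the same device (i.e. one transaction starts before the previous one has finished).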
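To illustrate that caveat, here is a small sketch of the backward lookup run on made-up overlapping data, using Python's sqlite3 as a scratch database (my assumption, not something from the thread). The aliases are shortened to r and s, and the helper name backward_lookup and the six sample rows are hypothetical.

```python
import sqlite3

QUERY = """
select r.id, r.deviceId, max(s.timestamp) as starttime,
       r.timestamp as endtime, r.measure
from t r
join t s
  on s.deviceId = r.deviceId
 and s.timestamp < r.timestamp
 and s.datatype = 19
 and s.measure = 1
where r.datatype = 54
group by r.id, r.deviceId, r.timestamp, r.measure
order by r.id
"""

def backward_lookup(rows):
    """For each datatype-54 row, pick the latest earlier start on that device."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table t (id integer, timestamp text, deviceId integer, "
        "datatype integer, measure integer)"
    )
    con.executemany("insert into t values (?, ?, ?, ?, ?)", rows)
    out = con.execute(QUERY).fetchall()
    con.close()
    return out

# Hypothetical overlap: a second process starts at 09:30 on device 501
# before the first one (started 09:00) finishes at 10:00.
OVERLAP = [
    (1, "2013-01-02 09:00:00", 501, 19, 1),
    (2, "2013-01-02 09:30:00", 501, 19, 1),
    (3, "2013-01-02 10:00:00", 501, 19, 0),
    (4, "2013-01-02 10:00:00", 501, 54, 65),
    (5, "2013-01-02 10:30:00", 501, 19, 0),
    (6, "2013-01-02 10:30:00", 501, 54, 70),
]

for row in backward_lookup(OVERLAP):
    print(row)
```

Both result rows (ids 4 and 6) come back with a start time of 09:30:00, so the 09:00 start is never reported. With non-overlapping data, like the rows in the question, the latest earlier start is the right one and the query behaves as intended.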