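I know this is a problem others must have solved before, but with my limited knowledge I haven't been able to get past it. I have data sorted by datetime that needs to be grouped by a combination of two fields (status and queue). Where status and queue stay the same over a given time range, the rows should be treated as part of the same group and therefore share the same ID.
To do this I tried DENSE_RANK(), and for all intents and purposes it works, except for the order of the groups. Here is an example: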
WITH TEMP1 (EVENT_DATE, PRV_EVENT_DATE, STATUS, PRV_STATUS, QUEUE, PRV_QUEUE) AS
(VALUES ('2012-09-04 11:40:19.936141', '', 'CREATED', '', 'SYSTEM', '')
,('2012-09-04 11:40:21.207140', '2012-09-04 11:40:19.936141', 'CREATED', 'CREATED', 'SYSTEM', 'SYSTEM')
,('2012-09-04 11:40:27.771140', '2012-09-04 11:40:21.207140', 'PROCESS', 'CREATED', 'PROCESS', 'SYSTEM')
,('2012-09-05 00:01:20.384180', '2012-09-04 11:40:27.771140', 'SUSPEND', 'PROCESS', 'SYSTEM', 'SYSTEM')
,('2012-09-05 00:02:14.042180', '2012-09-05 00:01:20.384180', 'SUSPEND', 'SUSPEND', 'PEND', 'SYSTEM')
,('2012-09-06 00:02:14.642180', '2012-09-05 00:02:14.042180', 'SUSPEND', 'SUSPEND', 'SYSTEM', 'SYSTEM')
,('2012-09-06 00:02:33.433180', '2012-09-06 00:02:14.642180', 'SUSPEND', 'SUSPEND', 'SYSTEM', 'SYSTEM')
)
SELECT
ROW_NUMBER() OVER (ORDER BY EVENT_DATE) AS "RN",
DENSE_RANK() OVER ( ORDER BY status, queue, date(event_date)) AS "GRP",
EVENT_DATE, PRV_EVENT_DATE, STATUS, PRV_STATUS, QUEUE, PRV_QUEUE
FROM TEMP1
ORDER BY EVENT_DATE
The result is as follows:
RN  GRP  EVENT_DATE                  PRV_EVENT_DATE              STATUS   PRV_STATUS  QUEUE
1   1    2012-09-04 11:40:19.936141                              CREATED              SYSTEM
2   1    2012-09-04 11:40:21.207140  2012-09-04 11:40:19.936141  CREATED  CREATED     SYSTEM
3   2    2012-09-04 11:40:27.771140  2012-09-04 11:40:21.207140  PROCESS  CREATED     PROCESS
4   4    2012-09-05 00:01:20.384180  2012-09-04 11:40:27.771140  SUSPEND  PROCESS     SYSTEM
5   3    2012-09-05 00:02:14.042180  2012-09-05 00:01:20.384180  SUSPEND  SUSPEND     PEND
6   5    2012-09-06 00:02:14.642180  2012-09-05 00:02:14.042180  SUSPEND  SUSPEND     SYSTEM
As you can see, "GRP" comes out in the wrong order (I also know that using date(EVENT_DATE) is not the solution).
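Answer 0 (score: 0)
It's not clear (at least to me) what you really want. Should a new group start whenever "STATUS" or "QUEUE" changes compared with the previous row? Or are there more complex rules?
It looks like your data is already the result of a query. You can calculate the previous value with MIN(status/queue) OVER (ORDER BY event_date ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING).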
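To make the previous-value idea concrete, here is a minimal sketch on three rows taken from the question's sample. The CTE name T and the output aliases are only for illustration; the one-row frame is the part that matters.

```sql
-- T is a hypothetical cut-down copy of the question's data (no PRV_ columns).
WITH T (EVENT_DATE, STATUS, QUEUE) AS
(VALUES ('2012-09-04 11:40:21.207140', 'CREATED', 'SYSTEM')
       ,('2012-09-04 11:40:27.771140', 'PROCESS', 'PROCESS')
       ,('2012-09-05 00:01:20.384180', 'SUSPEND', 'SYSTEM')
)
SELECT
  EVENT_DATE, STATUS, QUEUE,
  -- The frame contains only the previous row, so MIN simply returns that row's value.
  -- On the first row the frame is empty and the result is NULL.
  MIN(STATUS) OVER (ORDER BY EVENT_DATE
                    ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PRV_STATUS,
  MIN(QUEUE)  OVER (ORDER BY EVENT_DATE
                    ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PRV_QUEUE
FROM T
ORDER BY EVENT_DATE
```

Where your DBMS supports LAG, LAG(STATUS) OVER (ORDER BY EVENT_DATE) should give the same values; the MIN form is just the variant that also works without it.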
As long as you cast to DATE you will never get the right order. Try a calculation like this instead:
SUM(CASE WHEN status = prv_status AND queue = prv_queue THEN 0 ELSE 1 END)
OVER (ORDER BY event_date
ROWS UNBOUNDED PRECEDING)
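Put together as a complete query, the running sum might look like the sketch below. It assumes a new group starts whenever STATUS or QUEUE differs from the previous row, and it derives the previous values from the rows themselves instead of trusting stored PRV_ columns. The CTE name WITH_PRV is made up for this example.

```sql
WITH TEMP1 (EVENT_DATE, STATUS, QUEUE) AS
(VALUES ('2012-09-04 11:40:19.936141', 'CREATED', 'SYSTEM')
       ,('2012-09-04 11:40:21.207140', 'CREATED', 'SYSTEM')
       ,('2012-09-04 11:40:27.771140', 'PROCESS', 'PROCESS')
       ,('2012-09-05 00:01:20.384180', 'SUSPEND', 'SYSTEM')
       ,('2012-09-05 00:02:14.042180', 'SUSPEND', 'PEND')
       ,('2012-09-06 00:02:14.642180', 'SUSPEND', 'SYSTEM')
       ,('2012-09-06 00:02:33.433180', 'SUSPEND', 'SYSTEM')
),
WITH_PRV AS
(SELECT EVENT_DATE, STATUS, QUEUE,
        -- previous row's values; NULL on the very first row
        MIN(STATUS) OVER (ORDER BY EVENT_DATE
                          ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PRV_STATUS,
        MIN(QUEUE)  OVER (ORDER BY EVENT_DATE
                          ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS PRV_QUEUE
   FROM TEMP1
)
SELECT
  -- add 1 on every row that differs from its predecessor; the NULL comparison
  -- on the first row falls through to ELSE, so numbering starts at 1
  SUM(CASE WHEN STATUS = PRV_STATUS AND QUEUE = PRV_QUEUE THEN 0 ELSE 1 END)
      OVER (ORDER BY EVENT_DATE
            ROWS UNBOUNDED PRECEDING) AS GRP,
  EVENT_DATE, STATUS, QUEUE
FROM WITH_PRV
ORDER BY EVENT_DATE
```

On these seven rows GRP should come out as 1, 1, 2, 3, 4, 5, 5, which follows the event order. The two steps are kept in separate query levels because a window function cannot be nested inside another window function.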
Edit: If SUM OVER is not available, you have to use a scalar subquery as the input to DENSE_RANK. This should work:
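SELECT
  -- x = timestamp of the latest earlier row whose status or queue differs
  -- (OR, not AND: a change in either column starts a new group)
  (SELECT MAX(t2.event_date)
     FROM TEMP1 AS t2
    WHERE t2.event_date < t1.event_date
      AND (t1.status <> t2.status
        OR t1.queue <> t2.queue)) AS x,
  -- x is NULL for the first group, and where NULLs sort is DBMS-specific;
  -- if the alias x is rejected here, compute it in a derived table first
  DENSE_RANK() OVER (ORDER BY x) AS "GRP",
  -- (rest of the select list, then FROM TEMP1 AS t1)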
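Of course, performance might be terrible.
Maybe you're better off keeping the "wrong" order. At least it's the same wrong value for all rows of a group :-)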