Simplified for the sake of example...
I have a table, t1:
Reference  Ticket  TicketDate  Outcome  Source
1          1       2017-01-01  0        A
1          2       2017-01-02  0        A
1          3       2017-01-03  1        A
2          4       2017-01-01  0        A
2          4       2017-01-01  0        B
2          4       2017-01-01  0        C
2          5       2017-01-02  0        B
2          6       2017-01-03  1        B
3          7       2017-01-01  0        A
3          8       2017-01-02  0        A
3          9       2017-01-03  1        B
What I want to do is classify outcomes by source, for references where the latest outcome is 1 and an earlier outcome was 0 ...
For example:
with CTE as
(
    select t1.*, dense_rank() over(partition by reference order by ticketdate desc) as rn
    from t1
)
select c1.reference, c1.outcome, count(distinct c2.ticket) as now1was0
from CTE c1
left join CTE c2
    on c1.reference = c2.reference
    and c2.rn > c1.rn
    and c2.outcome = 0
    and c2.ticket <> c1.ticket
where c1.outcome = 1
-- qualify the columns: both c1 and c2 expose reference and outcome
group by c1.reference, c1.outcome
This works fine, but now I want to separate the count where the same source appeared previously from the count where it did not.
For example:
reference  outcome  now1was0same  now1was0different
1          1        1             0
2          1        1             0
3          1        0             1
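If the source of the outcome = 1 row exists in any prior Outcome = 0 row for the reference, it needs to show up as the same source.
Can anyone help me get from where I am to where I need to be?
Edit:
Some references may have a subsequent outcome = 0, but I intend to ignore those completely, and I will handle that in the cascade of CTEs that gets me to this point.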
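Answer 0 (score: 1)
Your sample data only shows outcome=1 on the latest entry; under that assumption, you can do it all with analytic functions and get rid of the self-join: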
SELECT reference,
       outcome,
       same_src AS now1was0same,
       all_src - same_src AS now1was0different
FROM   (
  SELECT reference,
         outcome,
         ROW_NUMBER() OVER ( PARTITION BY reference ORDER BY TicketDate DESC ) AS rn,
         COUNT( CASE outcome WHEN 0 THEN 1 END ) OVER ( PARTITION BY reference, source )
           AS same_src,
         COUNT( CASE outcome WHEN 0 THEN 1 END ) OVER ( PARTITION BY reference )
           AS all_src
  FROM   t1
)
WHERE  rn = 1
AND    outcome = 1;
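As a quick check of the query above, here is a minimal sketch that runs it against the question's sample rows. It assumes that SQLite (3.25 or later, for window functions) driven from Python's sqlite3 module is an acceptable stand-in for Oracle; the subquery alias x, the helper name run, and the second query that collapses the counts into 0/1 flags are illustrative additions, not part of the answer. On this data the original query counts the matching outcome = 0 rows (for example 2 and 2 for reference 2) rather than producing flags, so the flag variant may be closer to the table shown in the question.

```python
import sqlite3

# Sample rows from the question: (reference, ticket, ticketdate, outcome, source)
ROWS = [
    (1, 1, "2017-01-01", 0, "A"),
    (1, 2, "2017-01-02", 0, "A"),
    (1, 3, "2017-01-03", 1, "A"),
    (2, 4, "2017-01-01", 0, "A"),
    (2, 4, "2017-01-01", 0, "B"),
    (2, 4, "2017-01-01", 0, "C"),
    (2, 5, "2017-01-02", 0, "B"),
    (2, 6, "2017-01-03", 1, "B"),
    (3, 7, "2017-01-01", 0, "A"),
    (3, 8, "2017-01-02", 0, "A"),
    (3, 9, "2017-01-03", 1, "B"),
]

# Inner query shared by both variants (same logic as the answer).
INNER = """
    SELECT reference, outcome,
           ROW_NUMBER() OVER (PARTITION BY reference ORDER BY ticketdate DESC) AS rn,
           COUNT(CASE outcome WHEN 0 THEN 1 END)
               OVER (PARTITION BY reference, source) AS same_src,
           COUNT(CASE outcome WHEN 0 THEN 1 END)
               OVER (PARTITION BY reference) AS all_src
    FROM t1
"""

# The answer's outer query: raw counts of earlier outcome = 0 rows.
COUNTS_SQL = f"""
    SELECT reference, outcome, same_src, all_src - same_src
    FROM ({INNER}) x
    WHERE rn = 1 AND outcome = 1
    ORDER BY reference
"""

# Illustrative variant: collapse the counts into mutually exclusive 0/1 flags.
FLAGS_SQL = f"""
    SELECT reference, outcome,
           CASE WHEN same_src > 0 THEN 1 ELSE 0 END,
           CASE WHEN same_src = 0 AND all_src > 0 THEN 1 ELSE 0 END
    FROM ({INNER}) x
    WHERE rn = 1 AND outcome = 1
    ORDER BY reference
"""

def run(sql):
    """Load the sample rows into an in-memory table t1 and run one query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t1 (reference INT, ticket INT, ticketdate TEXT, "
        "outcome INT, source TEXT)"
    )
    con.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = con.execute(sql).fetchall()
    con.close()
    return rows

counts = run(COUNTS_SQL)
flags = run(FLAGS_SQL)
print(counts)
print(flags)
```

Dates are stored as ISO text here, which sorts correctly as strings; in Oracle the column would normally be a DATE.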
Answer 1 (score: 1)
I think this might be what you're after. Again, it uses analytic functions to do the work:
WITH t1 AS (SELECT 1 REFERENCE, 1 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 1 REFERENCE, 2 ticket, DATE '2017-01-02' ticketdate, 0 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 1 REFERENCE, 3 ticket, DATE '2017-01-03' ticketdate, 1 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 2 REFERENCE, 4 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 2 REFERENCE, 4 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'B' SOURCE FROM dual UNION ALL
SELECT 2 REFERENCE, 4 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'C' SOURCE FROM dual UNION ALL
SELECT 2 REFERENCE, 5 ticket, DATE '2017-01-02' ticketdate, 0 outcome, 'B' SOURCE FROM dual UNION ALL
SELECT 2 REFERENCE, 6 ticket, DATE '2017-01-03' ticketdate, 1 outcome, 'B' SOURCE FROM dual UNION ALL
SELECT 3 REFERENCE, 7 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 3 REFERENCE, 8 ticket, DATE '2017-01-02' ticketdate, 0 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 3 REFERENCE, 9 ticket, DATE '2017-01-03' ticketdate, 1 outcome, 'B' SOURCE FROM dual UNION ALL
SELECT 4 REFERENCE, 10 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 4 REFERENCE, 11 ticket, DATE '2017-01-02' ticketdate, 1 outcome, 'A' SOURCE FROM dual UNION ALL
SELECT 4 REFERENCE, 12 ticket, DATE '2017-01-03' ticketdate, 1 outcome, 'B' SOURCE FROM dual UNION ALL
SELECT 5 REFERENCE, 13 ticket, DATE '2017-01-01' ticketdate, 0 outcome, 'C' SOURCE FROM dual UNION ALL
SELECT 5 REFERENCE, 14 ticket, DATE '2017-01-02' ticketdate, 1 outcome, 'B' SOURCE FROM dual UNION ALL
SELECT 5 REFERENCE, 15 ticket, DATE '2017-01-03' ticketdate, 0 outcome, 'B' SOURCE FROM dual),
res AS (SELECT REFERENCE,
               ticket,
               ticketdate,
               outcome,
               SOURCE,
               CASE WHEN outcome = 1 THEN LAG(CASE WHEN outcome = 0 THEN 0 END IGNORE NULLS) OVER (PARTITION BY REFERENCE ORDER BY ticketdate) END prior_0_oc,
               CASE WHEN outcome = 1 THEN LAG(CASE WHEN outcome = 0 THEN SOURCE END IGNORE NULLS) OVER (PARTITION BY REFERENCE ORDER BY ticketdate) END prior_0_src,
               CASE WHEN outcome = 1 THEN LEAD(CASE WHEN outcome = 0 THEN 'Y' END IGNORE NULLS) OVER (PARTITION BY REFERENCE ORDER BY ticketdate) END next_0_present
        FROM t1)
SELECT REFERENCE,
       outcome,
       COUNT(CASE WHEN prior_0_oc = 0 AND prior_0_src = SOURCE THEN 1 END) now1was0samesrc,
       COUNT(CASE WHEN prior_0_oc = 0 AND prior_0_src != SOURCE THEN 1 END) now1was0diffsrc
FROM res
WHERE outcome = 1
AND next_0_present IS NULL
GROUP BY REFERENCE,
         outcome
ORDER BY REFERENCE;
This produces:
 REFERENCE    OUTCOME NOW1WAS0SAMESRC NOW1WAS0DIFFSRC
---------- ---------- --------------- ---------------
         1          1               1               0
         2          1               1               0
         3          1               0               1
         4          1               1               1
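For each outcome = 1 row, this query finds the most recent prior row that had a 0 outcome and picks out the outcome and source from it, then checks whether any subsequent 0 outcome rows are present (so we can exclude those from the report; I think that's what you meant?).
It then filters out all the 0 outcome rows and any outcome = 1 rows that have a 0 outcome row after them, before doing a conditional count to find the cases you're after.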
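The two steps described above can also be sketched without LAG ... IGNORE NULLS, which not every engine supports. The following is a rough illustration only, assuming SQLite driven from Python's sqlite3 module rather than Oracle; the correlated subqueries, the strict date comparisons (which treat rows sharing a ticketdate as neither prior nor subsequent), and the names QUERY and result are choices made for this sketch, not anything from the answer.

```python
import sqlite3

# Sample rows from the answer: (reference, ticket, ticketdate, outcome, source)
ROWS = [
    (1, 1, "2017-01-01", 0, "A"), (1, 2, "2017-01-02", 0, "A"),
    (1, 3, "2017-01-03", 1, "A"),
    (2, 4, "2017-01-01", 0, "A"), (2, 4, "2017-01-01", 0, "B"),
    (2, 4, "2017-01-01", 0, "C"), (2, 5, "2017-01-02", 0, "B"),
    (2, 6, "2017-01-03", 1, "B"),
    (3, 7, "2017-01-01", 0, "A"), (3, 8, "2017-01-02", 0, "A"),
    (3, 9, "2017-01-03", 1, "B"),
    (4, 10, "2017-01-01", 0, "A"), (4, 11, "2017-01-02", 1, "A"),
    (4, 12, "2017-01-03", 1, "B"),
    (5, 13, "2017-01-01", 0, "C"), (5, 14, "2017-01-02", 1, "B"),
    (5, 15, "2017-01-03", 0, "B"),
]

QUERY = """
WITH res AS (
    SELECT t.reference, t.outcome, t.source,
           -- source of the most recent earlier outcome = 0 row, if any
           (SELECT p.source FROM t1 p
             WHERE p.reference = t.reference AND p.outcome = 0
               AND p.ticketdate < t.ticketdate
             ORDER BY p.ticketdate DESC LIMIT 1) AS prior_0_src,
           -- 1 when a later outcome = 0 row exists for the reference, else 0
           EXISTS (SELECT 1 FROM t1 n
                    WHERE n.reference = t.reference AND n.outcome = 0
                      AND n.ticketdate > t.ticketdate) AS next_0_present
    FROM t1 t
    WHERE t.outcome = 1
)
SELECT reference, outcome,
       COUNT(CASE WHEN prior_0_src = source THEN 1 END) AS now1was0samesrc,
       COUNT(CASE WHEN prior_0_src <> source THEN 1 END) AS now1was0diffsrc
FROM res
WHERE next_0_present = 0
GROUP BY reference, outcome
ORDER BY reference
"""

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE t1 (reference INT, ticket INT, ticketdate TEXT, "
    "outcome INT, source TEXT)"
)
con.executemany("INSERT INTO t1 VALUES (?, ?, ?, ?, ?)", ROWS)
result = con.execute(QUERY).fetchall()
con.close()
print(result)
```

Reference 5 drops out because its only outcome = 1 row is followed by a later outcome = 0 row, matching the four-row output listed above.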
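Depending on whether you want to display results that match neither of the two scenarios you're checking for, you may want to add a HAVING clause to the final SQL statement to exclude rows where both counts are 0.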
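A HAVING clause of the kind mentioned above might look like the following sketch. To keep it short, it starts from a hand-filled stand-in for the res rows that survive the WHERE filter, and it uses SQLite via Python's sqlite3 module as an assumed stand-in for Oracle. Reference 6 is a hypothetical addition: an outcome = 1 row with no earlier outcome = 0 row, included only so that HAVING has something to exclude.

```python
import sqlite3

# (reference, outcome, source, prior_0_oc, prior_0_src) for outcome = 1 rows
RES_ROWS = [
    (1, 1, "A", 0, "A"),
    (2, 1, "B", 0, "B"),
    (3, 1, "B", 0, "A"),
    (4, 1, "A", 0, "A"),
    (4, 1, "B", 0, "A"),
    (6, 1, "A", None, None),  # hypothetical: no earlier outcome = 0 row
]

# Final SELECT from the answer, minus the filters already applied to RES_ROWS.
BASE = """
    SELECT reference, outcome,
           COUNT(CASE WHEN prior_0_oc = 0 AND prior_0_src = source THEN 1 END),
           COUNT(CASE WHEN prior_0_oc = 0 AND prior_0_src != source THEN 1 END)
    FROM res
    GROUP BY reference, outcome
"""

# Repeat the aggregate expressions instead of using column aliases,
# since Oracle does not accept select-list aliases in HAVING.
HAVING = """
    HAVING COUNT(CASE WHEN prior_0_oc = 0 AND prior_0_src = source THEN 1 END) > 0
        OR COUNT(CASE WHEN prior_0_oc = 0 AND prior_0_src != source THEN 1 END) > 0
"""

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE res (reference INT, outcome INT, source TEXT, "
    "prior_0_oc INT, prior_0_src TEXT)"
)
con.executemany("INSERT INTO res VALUES (?, ?, ?, ?, ?)", RES_ROWS)
without_having = con.execute(BASE + " ORDER BY reference").fetchall()
with_having = con.execute(BASE + HAVING + " ORDER BY reference").fetchall()
con.close()

print(without_having)  # includes the (6, 1, 0, 0) row
print(with_having)     # the all-zero row is gone
```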
ETA:
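If you want to count cases where the rows for a given reference flip between 0 and 1 multiple times (e.g. 0, 0, 1, 0, 1) but still ignore references that end with a 0 outcome, then change: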
CASE WHEN outcome = 1 THEN LEAD(CASE WHEN outcome = 0 THEN 'Y' END IGNORE NULLS) OVER (PARTITION BY REFERENCE ORDER BY ticketdate) END next_0_present
to:
CASE WHEN outcome = 1 AND row_number() OVER (PARTITION BY REFERENCE, outcome ORDER BY ticketdate DESC) = 1
THEN LAG(CASE WHEN outcome = 0 THEN 'Y' END IGNORE NULLS) OVER (PARTITION BY REFERENCE ORDER BY ticketdate DESC) END next_0_present
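Of course, if I've got completely the wrong end of the stick about what you meant by:
Some references may have a subsequent outcome = 0, but I intend to ignore those completely
and you just want to ignore the extra 0 rows, you can simply remove the next_0_present column from the query, along with the AND next_0_present IS NULL filter that depends on it.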