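I posted this question before, but I think the point was missed.
Here is my new explanation.
The following script should DBMS_OUTPUT only the missing sequences. However, when I tested it by deleting a record, the script still did not report that sequence number as missing.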
http://sqlfiddle.com/#!4/3c9e2/1/0
create or replace procedure show_missing_seqs(yy in varchar2 default '[0-9]{2}',
mm in varchar2 default '[0-9]{2}',
dd in varchar2 default '[0-9]{2}') as
pattern varchar2(80);
min_seq number(4):=1;
max_seq number(4):=9999;
cursor cur(pattern varchar2) is with
t as(
select to_number(substr(filename, 5, 4)) as seq,
substr(filename, 10, 2) as yy,
substr(filename, 13, 2) as mm,
substr(filename, 16, 2) as dd
from test1@ra2013
where regexp_like(filename, pattern)),
r(yy, mm, dd, seq, max_seq) as(
select yy, mm, dd, min_seq, max_seq
from t
group by yy, mm, dd
union all
select yy, mm, dd, seq + 1, max_seq
from r
where seq + 1 <= max_seq)
select yy, mm, dd, seq as missing_seq
from r
where not exists (select 1
from t
where t.yy = r.yy
and t.mm = r.mm
and t.dd = r.dd
and t.seq = r.seq)
order by yy, mm, dd, seq;
begin
pattern := 'CDR[-][0-9]{4}[_][0-9]{2}' || yy || '[_][0-9]{2}' || mm ||
'[_][0-9]{2}[_][0-9]{4}[_][N]["2"]';
for rec in cur(pattern) loop
dbms_output.put_line(rec.missing_seq);
end loop;
dbms_output.put_line('Done');
end show_missing_seqs;
Answer 0 (score: 0)
If you run the PL/SQL script from SQL*Plus, make sure serveroutput is enabled so that you can actually see what it prints.
SQL> set serveroutput on
Answer 1 (score: 0)
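Are you saying that the missing sequences from 510 to 4356 are not being reported?
Running your code exactly as posted, I do see output: it prints every missing sequence up to 509. You don't see anything after 509 because of the UNION ALL operator. UNION ALL passes the last row of the previous result (from t) into r, so the missing sequences only run up to 509. To illustrate this, I explicitly referenced t in the second part of the UNION ALL and also printed max_seq. In the results, the first two rows come from the upper part of the UNION ALL and the rest come from the second part; look at the max_seq column.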
COL YY MM DD SEQ  MISSING_SEQ MAX_SEQ
A   14 01 17 4356 4356        4356
A   14 02 07 397  509         509
B   14 02 07 398  509         509
with t as
(
SELECT
to_number(substr(ename, 5,4 )) as seq,
substr(ename, 10, 2) as yy,
substr(ename, 13, 2) as mm,
substr(ename, 16, 2) as dd
from table1
where regexp_like(ename, 'CDR[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][N]["2"]')
),
r (col, yy, mm, dd, seq, max_seq) as (
SELECT 'A', yy, mm, dd, min(seq), max(seq)
from t
group by yy, mm, dd
union all
select 'B', yy, mm, dd,seq + 1, max_seq
from r
where seq + 1 <= (SELECT max(max_seq) FROM t)
)
select col, yy, mm, dd, seq, max_seq as missing_seq, max_seq
from r
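The way the two halves of the UNION ALL interact can be seen in a smaller, self-contained sketch. The inline rows below are made-up data, not the contents of table1, and the src column plays the same role as col in the query above. It assumes Oracle 11gR2 or later, where recursive subquery factoring is available.

```sql
-- Hypothetical inline data: two date groups with different sequence ranges.
with t (dd, seq) as (
  select '17', 1 from dual union all
  select '17', 3 from dual union all
  select '07', 2 from dual union all
  select '07', 5 from dual
),
r (src, dd, seq, max_seq) as (
  -- Anchor member: one row per group, starting at that group's lowest seq.
  select 'A', dd, min(seq), max(seq)
  from t
  group by dd
  union all
  -- Recursive member: counts up from each anchor row to that row's own max_seq.
  select 'B', dd, seq + 1, max_seq
  from r
  where seq + 1 <= max_seq
)
select src, dd, seq, max_seq
from r
order by dd, seq;
```

Each group produces one 'A' row followed by 'B' rows: the '07' group runs from 2 to 5 and the '17' group from 1 to 3. The recursion for a group never goes past the max_seq carried in its own anchor row.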
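I'm not sure why you are comparing on YY-MM-DD, but here is the SQL I would use to mimic what you are doing. The only difference is t2, which I use to make sure the last row generated is the one at max_seq.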
with t as
(
SELECT
to_number(substr(ename, 5,4 )) as seq,
substr(ename, 10, 2) as yy,
substr(ename, 13, 2) as mm,
substr(ename, 16, 2) as dd
from table1
where regexp_like(ename, 'CDR[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][N]["2"]')
), t2 AS (
select max(yy) yy, max(mm) mm, max(dd) dd, min(seq) seq, max(seq) max_seq from t ORDER BY 5 ASC
),
r (yy, mm, dd, seq, max_seq) as (
select yy, mm, dd, (seq), (max_seq) from t2
union all
select yy, mm, dd,seq + 1, max_seq
from r
where seq + 1 <= max_seq
)
select yy, mm, dd, seq as missing_seq, max_seq
from r
where not exists (
select 1 from t2 t
where t.yy = r.yy
and t.mm = r.mm
and t.dd = r.dd
and t.seq = r.seq
)
order by yy, mm, dd, seq
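Let me know if this helps.
Answer 2 (score: 0)
Based on the comments about the sequence possibly cycling within a day, the query gets more complicated, but I think this is what you are looking for: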
with t as
(
select to_number(substr(ename, 5,4 )) as seq,
to_date(substr(ename, 10, 2) || substr(ename, 13, 2)
|| substr(ename, 16, 2) || substr(ename, 19, 4), 'RRMMDDHH24MI')
as cdr_time
from table1
where regexp_like(ename,
'CDR[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][N]["2"]')
order by substr(ename, 10, 13)
),
u as (
select cdr_time,
case when lag(seq) over (order by cdr_time) is null then seq
when lag(seq) over (order by cdr_time) > seq then 0
when trunc(lag(cdr_time) over (order by cdr_time)) != trunc(cdr_time)
then mod(lag(seq) over (order by cdr_time) + 1, 10000)
end as min_seq,
case when lead(seq) over (order by cdr_time) is null then seq
when lead(seq) over (order by cdr_time) < seq then 9999
when trunc(lead(cdr_time) over (order by cdr_time)) != trunc(cdr_time)
then seq
end as max_seq
from t
),
v as (
select cdr_time as min_cdr_time,
lead(cdr_time) over (order by cdr_time) as max_cdr_time,
min_seq,
lead(max_seq) over (order by cdr_time) as max_seq
from u
where min_seq is not null or max_seq is not null
),
w (min_cdr_time, max_cdr_time, seq, max_seq) as (
select min_cdr_time, max_cdr_time, min_seq, max_seq
from v
where min_seq is not null
union all
select min_cdr_time, max_cdr_time, seq + 1, max_seq
from w
where seq + 1 <= max_seq
)
select to_char(min_cdr_time, 'YYYY-MM-DD') as cdr_date,
to_char(seq, 'FM0000') as missing_seq
from w
where not exists (
select 1
from t
where t.cdr_time between w.min_cdr_time and w.max_cdr_time
and t.seq = w.seq
)
order by min_cdr_time, seq;
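The t CTE in the query above depends on fixed substr offsets and on the RRMMDDHH24MI date mask, so it can be worth checking them against a single name first. This is a minimal sketch: 'CDR-0397_14_02_07_1234_N2' is a hypothetical file name constructed to match the pattern, not a value taken from the real table.

```sql
-- Hypothetical file name, built only to satisfy the regular expression below.
select to_number(substr(ename, 5, 4)) as seq,
       to_char(
         to_date(substr(ename, 10, 2) || substr(ename, 13, 2)
                 || substr(ename, 16, 2) || substr(ename, 19, 4),
                 'RRMMDDHH24MI'),
         'YYYY-MM-DD HH24:MI') as cdr_time
from (select 'CDR-0397_14_02_07_1234_N2' as ename from dual)
where regexp_like(ename,
  'CDR[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][N]["2"]');
```

For this name the sequence comes out as 397 and the time as 12:34 on 7 February 2014. If your real names have a different layout, the offsets would need adjusting.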
These are all CTEs again; most of them could be subqueries, but w is recursive, so it is probably easier to keep them all as CTEs.
CTE t converts ename into a date/time value and a sequence value; these are used to generate the ranges and, later, to check for missing values.
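CTE u compares each record from t, ordered by cdr_time, with the previous and next records to work out when the date changes or the sequence has cycled. This is a bit tricky, especially when the two coincide, but from a little testing I think it works.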
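CTE v just condenses the results from u into a single record per date/cycle block. (In the SQL Fiddle below this had to cast the two date values to dates, which doesn't make sense; I'm not sure whether that is specific to that environment, as I had no problem running it myself.)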
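CTE w generates all the sequence values in each date/cycle block, in much the same way as the answer to the previous question. The final query then looks for missing values within those ranges, again the same as before.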
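The boundary tests that u performs can be looked at in isolation with a small sketch. The four rows below are made-up data, chosen so that the sequence wraps from 9999 to 0 exactly at midnight; the boundary column is a label added purely for illustration and does not exist in the query above.

```sql
-- Hypothetical inline data: the sequence wraps at the same moment the date changes.
with t (cdr_time, seq) as (
  select to_date('1402072358', 'RRMMDDHH24MI'), 9998 from dual union all
  select to_date('1402072359', 'RRMMDDHH24MI'), 9999 from dual union all
  select to_date('1402080000', 'RRMMDDHH24MI'), 0    from dual union all
  select to_date('1402080001', 'RRMMDDHH24MI'), 2    from dual
)
select to_char(t.cdr_time, 'YYYY-MM-DD HH24:MI') as cdr_ts,
       t.seq,
       lag(t.seq) over (order by t.cdr_time) as prev_seq,
       case
         -- Same order of tests as the min_seq expression in u.
         when lag(t.seq) over (order by t.cdr_time) is null then 'first row'
         when lag(t.seq) over (order by t.cdr_time) > t.seq then 'sequence cycled'
         when trunc(lag(t.cdr_time) over (order by t.cdr_time)) != trunc(t.cdr_time)
           then 'date changed'
       end as boundary
from t
order by t.cdr_time;
```

With this data the third row is labelled 'sequence cycled' even though the date changes on the same row, because the cycle test comes first in the CASE. The fourth row gets no label, although sequence 1 is missing; finding that gap is left to the range generation in w and the final NOT EXISTS check.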
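SQL Fiddle with a chunk of CDRs, some of which have been deleted to create missing values that should be reported. To make life easier I created one record per minute, but it will work with bigger gaps. I hit the CPU resource limit with the data from your current fiddle, so you may have problems running this in your own environment, depending on how much data and how many gaps you have.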
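If you put this in a procedure and want to limit the output to the gaps for a single day, you will probably want to add a restriction in u that includes a day either side of that day, and another in v for just that day. Hopefully this gives you something you can adapt.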