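I have millions of string records like this one. There are 310 types, each with a different format for extracting the sequence, year, month and day...
The script gets the sequence, year, month and day... Now I want a PL/SQL routine that gets the maximum and minimum values of the sequence and finds the missing numbers, for example for year and month 14-06. How can I do that?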
Answer 0: (score: 1)
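You don't want to be looking at dual at all here, and certainly not trying to insert into it. You need to track the highest and lowest values you've seen as you iterate through the loop. Since some elements of ename represent dates, I'm pretty sure you want all of your matches to be 0-9, not 1-9; otherwise values containing a zero, such as 01 or 10, would not match. You're also referring to the cursor name when you access its fields, instead of the record variable name:

    FOR List_ENAME_rec IN List_ENAME_cur loop
      if REGEXP_LIKE(List_ENAME_rec.ENAME,'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]') then
        -- the elements sit at fixed positions in the matched string
        V_seq := substr(List_ENAME_rec.ename,5,4);
        V_Year := substr(List_ENAME_rec.ename,10,2);
        V_Month := substr(List_ENAME_rec.ename,13,2);
        V_day := substr(List_ENAME_rec.ename,16,2);
        -- track the lowest and highest sequence seen so far
        if min_seq is null or V_seq < min_seq then
          min_seq := v_seq;
        end if;
        if max_seq is null or V_seq > max_seq then
          max_seq := v_seq;
        end if;
      end if;
    end loop;

With values in the table of emp-1111_14_01_01_1111_G1 and emp-1115_14_02_02_1111_G1, that reports max_seq 1115 min_seq 1111.

If you really wanted to involve dual, you could do this inside the loop instead of the if/then/assign pattern, but there's no need:

    -- nvl guards the first iteration: least/greatest return null if any argument is null
    select least(nvl(min_seq, v_seq), v_seq), greatest(nvl(max_seq, v_seq), v_seq)
    into min_seq, max_seq
    from dual;

I have no idea what the procedure is going to do; there doesn't seem to be any relationship between whatever you have in test1 and the values you're finding.

You don't need any PL/SQL for this, though. You can get the min/max values from a simple query: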

    select min(to_number(substr(ename, 5, 4))) as min_seq,
      max(to_number(substr(ename, 5, 4))) as max_seq
    from table1
    where status = 2
    and regexp_like(ename,
      'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]');

       MIN_SEQ    MAX_SEQ
    ---------- ----------
          1111       1115

And you can use those to generate a list of all the values in that range:

    with t as (
      select min(to_number(substr(ename, 5, 4))) as min_seq,
        max(to_number(substr(ename, 5, 4))) as max_seq
      from table1
      where status = 2
      and regexp_like(ename,
        'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
    )
    -- t has a single row, so connect by level produces one row per value in the range
    select min_seq + level - 1 as seq
    from t
    connect by level <= (max_seq - min_seq) + 1;

           SEQ
    ----------
          1111
          1112
          1113
          1114
          1115

And a slightly different common table expression shows which of those don't exist in your table, which I think is what you're after:

    with t as (
      -- sequences that actually exist
      select to_number(substr(ename, 5, 4)) as seq
      from table1
      where status = 2
      and regexp_like(ename,
        'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
    ),
    u as (
      select min(seq) as min_seq,
        max(seq) as max_seq
      from t
    ),
    v as (
      -- every value between the minimum and the maximum
      select min_seq + level - 1 as seq
      from u
      connect by level <= (max_seq - min_seq) + 1
    )
    select v.seq as missing_seq
    from v
    left join t on t.seq = v.seq
    where t.seq is null
    order by v.seq;

    MISSING_SEQ
    -----------
           1112
           1113
           1114

Or, if you prefer:
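
    ...
    select v.seq as missing_seq
    from v
    where not exists (select 1 from t where t.seq = v.seq)
    order by v.seq;

Based on the comments, I think you want the missing sequence values for each combination of the other elements of the ID (YY_MM_DD). This will give you that breakdown: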

    with t as (
      select to_number(substr(ename, 5, 4)) as seq,
        substr(ename, 10, 2) as yy,
        substr(ename, 13, 2) as mm,
        substr(ename, 16, 2) as dd
      from table1
      where status = 2
      and regexp_like(ename,
        'emp[-][0-9]{4}[_][0-9]{2}[_][0-9]{2}[_][0-9]{2}[_][0-9]{4}[_][G][1]')
    ),
    r (yy, mm, dd, seq, max_seq) as (
      -- anchor branch: the lowest and highest sequence for each date
      select yy, mm, dd, min(seq), max(seq)
      from t
      group by yy, mm, dd
      union all
      -- recursive branch: step through every value up to that date's maximum
      select yy, mm, dd, seq + 1, max_seq
      from r
      where seq + 1 <= max_seq
    )
    select yy, mm, dd, seq as missing_seq
    from r
    where not exists (
      select 1 from t
      where t.yy = r.yy
      and t.mm = r.mm
      and t.dd = r.dd
      and t.seq = r.seq
    )
    order by yy, mm, dd, seq;

With output like:
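
    YY   MM   DD     MISSING_SEQ
    ---- ---- ---- -------------
    14   01   01            1112
    14   01   01            1113
    14   01   01            1114
    14   02   02            1118
    14   02   02            1120
    14   02   03            1127
    14   02   03            1128

If you want to look for a particular date you could filter on it (either in t, or in the first branch of r), but you could also change the regex pattern to include the fixed values; so to look for 14 06, for example, the pattern would be 'emp[-][0-9]{4}_14_06_[0-9]{2}[_][0-9]{4}[_][G][1]'. That's harder to generalise, though, so a filter (where t.yy = '14' and t.mm = '06') might be more flexible.

If you insist on having this in a procedure, you can make the date elements optional and modify the regex pattern: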

    create or replace procedure show_missing_seqs(yy in varchar2 default '[0-9]{2}',
      mm in varchar2 default '[0-9]{2}', dd in varchar2 default '[0-9]{2}') as
      pattern varchar2(80);
      cursor cur (pattern varchar2) is
        with t as (
          select to_number(substr(ename, 5, 4)) as seq,
            substr(ename, 10, 2) as yy,
            substr(ename, 13, 2) as mm,
            substr(ename, 16, 2) as dd
          from table1
          where status = 2
          and regexp_like(ename, pattern)
        ),
        r (yy, mm, dd, seq, max_seq) as (
          select yy, mm, dd, min(seq), max(seq)
          from t
          group by yy, mm, dd
          union all
          select yy, mm, dd, seq + 1, max_seq
          from r
          where seq + 1 <= max_seq
        )
        select yy, mm, dd, seq as missing_seq
        from r
        where not exists (
          select 1 from t
          where t.yy = r.yy
          and t.mm = r.mm
          and t.dd = r.dd
          and t.seq = r.seq
        )
        order by yy, mm, dd, seq;
    begin
      -- any date element that isn't supplied keeps its generic two-digit pattern
      pattern := 'emp[-][0-9]{4}[_]'
        || yy || '[_]' || mm || '[_]' || dd
        || '[_][0-9]{4}[_][G][1]';
      for rec in cur(pattern) loop
        dbms_output.put_line(to_char(rec.missing_seq, 'FM0000'));
      end loop;
    end show_missing_seqs;
    /

I have no idea why you insist it has to be done like that, or why you want to use dbms_output, since you're relying on the client/caller displaying it; what will your job do with the output? You could have this return a sys_refcursor instead, which would be more flexible. Either way, you can call it from SQL*Plus or SQL Developer.
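As a rough sketch of what such a call might look like, assuming the show_missing_seqs procedure above compiled cleanly and the client is set up to display dbms_output (the argument values here are only illustrative):

```sql
-- Sketch only: assumes show_missing_seqs (defined above) exists in the current schema.
-- "set serveroutput on" and "exec" are SQL*Plus / SQL Developer script commands, not plain SQL.
set serveroutput on

-- No arguments: every date element falls back to its default [0-9]{2} pattern.
exec show_missing_seqs

-- Year 14, month 06, any day (positional arguments).
exec show_missing_seqs('14', '06')

-- Only the day fixed, using named notation for the parameter.
exec show_missing_seqs(dd => '01')
```

Without set serveroutput on (or the DBMS Output pane enabled in SQL Developer), the procedure still runs but nothing is displayed, which is the reliance on the client mentioned above.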