Question

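The data in the file_name field of the Generation table should be an assigned number, then _01, _02, or _03, etc., and then .pdf (example: 82617_01.pdf).

Somewhere, the program is putting a state name, and sometimes a date/time stamp, between the assigned number and the 01, 02, etc. (82617_ALABAMA_01.pdf or 19998_MAINE_07-31-2010_11-05-59_AM.pdf or 5485325_OREGON_01.pdf, for example).

We would like to develop a SQL statement to find the bad file names and fix them. In theory it seems rather simple to find the file names that contain a varchar2 string and remove it, but putting the statement together is beyond me.

Any help or suggestions appreciated.

Something like: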

UPDATE GENERATION

SET FILE_NAME (?)

WHERE FILE_NAME (?...LIKE '%STRING%');?

Answer 1

You can find the problem rows (those with more than one underscore) like this:

select *
from Files
where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1

You can fix them like this:

update Files
set FILE_NAME = SUBSTR(FILE_NAME, 1, instr(FILE_NAME, '_') -1) ||
    SUBSTR(FILE_NAME, instr(FILE_NAME, '_', 1, 2))
where length(FILE_NAME) - length(replace(FILE_NAME, '_', '')) > 1

SQL Fiddle Example
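
Before running the update, it may be worth previewing what it would do. The following is only a sketch: it uses the sample names from the question in place of the real table (the `t1` CTE and the `old_name`/`new_name` aliases are made up for illustration) and applies the same expression as the update above without changing any data.

```sql
-- Preview only: nothing is modified.
-- t1 is a hypothetical stand-in for the real table, filled with the
-- sample file names from the question.
with t1(col) as (
    select '82617_ALABAMA_01.pdf' from dual union all
    select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
    select '5485325_OREGON_01.pdf' from dual
)
select col as old_name
       -- text before the first underscore, then everything from the
       -- second underscore onward
     , substr(col, 1, instr(col, '_') - 1)
       || substr(col, instr(col, '_', 1, 2)) as new_name
  from t1
 where length(col) - length(replace(col, '_', '')) > 1;
```

Because the expression removes only the text between the first and second underscore, the two state-only names should come out as 82617_01.pdf and 5485325_01.pdf, while the timestamped name would still keep its date and time part (19998_07-31-2010_11-05-59_AM.pdf) and would need separate handling.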

Answer 2

You could also use the regexp_replace function:

SQL> with t1(col) as(
  2      select '82617_mm_01.pdf' from dual union all
  3      select '456546_khkjh_89kjh_67_01.pdf' from dual union all
  4      select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
  5      select '5485325_OREGON_01.pdf' from dual
  6     )
  7   select col
  8        , regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3') res
  9     from t1;

COL                                     RES
 -------------------------------------- -----------------------------------------
82617_mm_01.pdf                         82617_01.pdf
456546_khkjh_89kjh_67_01.pdf            456546_01.pdf
19998_MAINE_07-31-2010_11-05-59_AM.pdf  19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf                   5485325_01.pdf

To show the good or the bad data, the regexp_like function will come in handy:

SQL> with t1(col) as(
  2      select '826170_01.pdf' from dual union all
  3      select '456546_01.pdf' from dual union all
  4      select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
  5      select '5485325_OREGON_01.pdf' from dual
  6     )
  7   select col bad_data
  8     from t1
  9   where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$');

BAD_DATA
--------------------------------------
19998_MAINE_07-31-2010_11-05-59_AM.pdf
5485325_OREGON_01.pdf

SQL> with t1(col) as(
  2      select '826170_01.pdf' from dual union all
  3      select '456546_01.pdf' from dual union all
  4      select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
  5      select '5485325_OREGON_01.pdf' from dual
  6     )
  7   select col good_data
  8     from t1
  9   where regexp_like(col, '^[0-9]+_\d{2}\.pdf$');

GOOD_DATA
--------------------------------------
826170_01.pdf
456546_01.pdf
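
If it is more convenient to see both groups at once, the same pattern can be used inside a CASE expression. This is just a sketch over the same sample rows; the `status` alias and the 'GOOD'/'BAD' labels are made up for illustration.

```sql
-- Label every row instead of filtering: one pass shows good and bad names.
with t1(col) as (
    select '826170_01.pdf' from dual union all
    select '456546_01.pdf' from dual union all
    select '19998_MAINE_07-31-2010_11-05-59_AM.pdf' from dual union all
    select '5485325_OREGON_01.pdf' from dual
)
select col
     , case
         when regexp_like(col, '^[0-9]+_\d{2}\.pdf$') then 'GOOD'
         else 'BAD'  -- also catches NULL names, since the test is not true
       end as status
  from t1;
```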

To that end, your update statement might look like this:

update your_table 
   set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3');
 --where clause if needed
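
As a rough sketch of what that where clause could look like, the update can be limited to rows that are malformed but still end in _NN.pdf, so names the pattern cannot repair (such as the timestamped one) are left alone and can be listed afterwards. `your_table` and `col` are the same placeholders as above. The first query is an optional check that assumes renamed files should not collide; the question does not say whether file_name has to be unique.

```sql
-- Optional check: would two rows end up with the same name after the fix?
select regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3') as new_name
     , count(*) as cnt
  from your_table
 group by regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3')
having count(*) > 1;

-- Rewrite only rows that are malformed but fit the fixable shape
-- (number, extra text, two-digit sequence, .pdf).
update your_table
   set col = regexp_replace(col, '^([0-9]+)_(.*)_(\d{2}\.pdf)$', '\1_\3')
 where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$')
   and regexp_like(col, '^[0-9]+_.*_\d{2}\.pdf$');

-- Whatever is still malformed needs a manual look.
select col
  from your_table
 where not regexp_like(col, '^[0-9]+_\d{2}\.pdf$');
```

Running this inside a transaction and checking the last query before committing keeps the change easy to roll back.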

Removing a varchar2 string from the middle of a table data value
