Perl regex to capture the string between anchor words

Date: 2012-05-08 12:00:22

Tags: perl

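I am still working on cleaning up Oracle files. I need to replace strings in each file where the Oracle schema name is prepended to a function/procedure/package name and where that name is wrapped in double quotes. Once the definition is corrected, I write it back to the file along with the rest of the actual code.

I have written code that handles simple declarations (no in/out parameters). Now I am trying to get my regex to work on declarations that do have parameters (note: this post is a continuation of an earlier question). Here is an example of what I am trying to clean up:

Replace: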

CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS

With:

CREATE OR REPLACE FUNCTION DC_F_DUMP_CSV_MMA (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS

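I have been trying to use the following regex to split the declaration apart, so that I can rebuild it after stripping the schema name and removing the double quotes from the function/procedure/package name. I am struggling to get each piece into its own capture buffer. This is my latest attempt at getting all of the in/out parameters in the middle into a buffer of their own: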

\b(CREATE\sOR\sREPLACE\s(PACKAGE|PACKAGE\sBODY|PROCEDURE|FUNCTION))(?:\W+\w+){1,100}?\W+(RETURN)\s*(\W+\w+)\s(AS|IS)\b
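
For comparison, the pieces named above (object kind, name, parameter list, return type, AS/IS) can each be given their own buffer with a commented /x pattern and named captures. This is only a sketch under stated assumptions, not the pattern from the post: the group names (head, name, params, ret, kw) are made up for illustration, named captures need Perl 5.10 or later, and the parameter list is assumed not to contain a closing parenthesis (a default value such as a function call would break it).

```perl
use strict;
use warnings;

# Sample declaration copied from the question.
my $decl = <<'SQL';
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
p_trailing_separator IN BOOLEAN DEFAULT FALSE,
p_max_linesize IN NUMBER DEFAULT 32000,
p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS
SQL

# Assumed pattern for illustration; group names are hypothetical.
my $re = qr{
    \b (?<head> CREATE \s+ OR \s+ REPLACE \s+
        (?: PACKAGE \s+ BODY | PACKAGE | PROCEDURE | FUNCTION ) )
    \s+ (?: "? \w+ "? \. )?                  # optional schema prefix
    "? (?<name> \w+ ) "?                     # object name without quotes
    (?: \s* \( (?<params> [^)]* ) \) )?      # optional parameter list
    (?: \s+ RETURN \s+ (?<ret> \w+ ) )?      # optional return type
    \s+ (?<kw> AS | IS ) \b
}xi;

my %part;
if ( $decl =~ $re ) {
    %part = %+;    # copy the named captures before another match resets them
    print "kind/head: $part{head}\n";
    print "name:      $part{name}\n";
    print "returns:   $part{ret}\n" if defined $part{ret};
    print "keyword:   $part{kw}\n";
}
```

The schema prefix is matched but deliberately not captured, so rebuilding the declaration from these pieces drops both the schema name and the quotes.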

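Any and all help is greatly appreciated!

Here is the script I am currently using to evaluate the files and write out the corrected versions: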

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use Data::Dumper;

# utility to clean strings
sub trim($) {
    my $string = shift;
    $string = "" if !defined($string);

    $string =~ s/^\s+//;
    $string =~ s/\s+$//;

    # aggressive removal of blank lines
    $string =~ s/\n+/\n/g;
    return $string;
}

sub cleanup_packages {
    my $file = shift;
    my $tmp  = $file . ".tmp";
    my $package_name;

    open( OLD, "< $file" ) or die "open $file: $!";
    open( NEW, "> $tmp" )  or die "open $tmp: $!";

    while ( my $line = <OLD> ) {

  # look for the first line of the file to contain a CREATE OR REPLACE STATEMENT
        if ( $line =~
m/^(CREATE\sOR\sREPLACE)\s*(PACKAGE|PACKAGE\sBODY)?\s(.+)\s(AS|IS)?/i
          )
        {

            # look ahead to next line, in case the AS/IS is next
            my $nextline = <OLD>;

            # from the above IF clause, the package name is in buffer 3
            $package_name = $3;

             # if the package name and the AS/IS is on the same line, and
             # the package name is quoted/prepended by the TRON2000 schema name
            if ( $package_name =~ m/"TRON2000"\."(\w+)"(\s*|\S*)(AS|IS)/i ) {
                # grab just the name ($1) and the AS/IS part ($3)
                $package_name =~ s/"TRON2000"\."(\w+)"(\s*|\S*)(AS|IS)/$1 $3/i;
                $package_name = trim($package_name);    # trim returns the cleaned copy
            }
            elsif (    ( $package_name =~ m/"TRON2000"\."(\w+)"/i )
                    && ( $nextline =~ m/(AS|IS)/ ) )
            {

# if the AS/IS was on the next line from the name, put them together on one line
                $package_name =~ s/"TRON2000"\."(\w+)"(\s*|\S*)/$1/i;
                $package_name = trim($package_name) . ' ' . trim($nextline);
                $package_name = trim($package_name);    # remove trailing carriage return
            }

            # now put the line back together
            $line =~
s/^(CREATE\sOR\sREPLACE)\s*(PACKAGE|PACKAGE\sBODY|FUNCTION|PROCEDURE)?\s(.+)\s(AS|IS)?/$1 $2 $package_name/ig;

            # and print it to the file
            print NEW "$line\n";
        }
        else {

            # just a normal line - print it to the temp file
            print NEW $line or die "print $tmp: $!";
        }
    }

    # close up the files
    close(OLD) or die "close $file: $!";
    close(NEW) or die "close $tmp: $!";

    # rename the temp file as the original file name
    unlink($file) or die "unlink $file: $!";
    rename( $tmp, $file ) or die "can't rename $tmp to $file: $!";
}

# find and clean up oracle files
sub eachFile {
    my $ext;
    my $filename = $_;
    my $fullpath = $File::Find::name;

    if ( -f $filename ) {
        ($ext) = $filename =~ /(\.[^.]+)$/;
    }
    else {

        # ignore non files
        return;
    }

    if ( defined $ext && $ext =~ /^\.(?:spp|sps|spb|sf|sp)$/i ) {
        print "package: $filename\n";
        cleanup_packages($fullpath);
    }
    else {
        print "$filename not specified for processing!\n";
    }
}

MAIN:
{
    my $dir = 'C:/1_atest';

    # find() calls eachFile for every entry under $dir, and eachFile
    # cleans up each matching file, so no separate loop is needed
    find( \&eachFile, "$dir/" );
}

1 answer:

Answer 0 (score: 3)

Assuming the entire contents of the file are stored in a single scalar variable, the following should do the trick.

my $Str = q{
CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
    p_trailing_separator IN BOOLEAN DEFAULT FALSE,
    p_max_linesize IN NUMBER DEFAULT 32000,
    p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS

CREATE OR REPLACE FUNCTION "TRON2000"."DC_F_DUMP_CSV_MMA" (
    p_trailing_separator IN BOOLEAN DEFAULT FALSE,
    p_max_linesize IN NUMBER DEFAULT 32000,
    p_mode IN VARCHAR2 DEFAULT 'w'
)
RETURN NUMBER
IS
};

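# strip the quoted schema prefix and the quotes around the object name;
# $1 already ends in whitespace, so no extra space is added before $2
$Str =~ s#^(create\s+(?:or\s+replace\s+)?(?:package\s+body|\w+)\s+)"[^"]+"\."([^"]+)"#$1$2#mig;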

print $Str;
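
To apply the same idea to a real file rather than a literal string, the file first has to be read into one scalar. Below is a minimal sketch of that step, assuming the files are small enough to hold in memory. The helper name strip_schema_in_file and the sample package name MY_PKG are hypothetical, and the pattern is a variant of the substitution above that also allows for PACKAGE BODY.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: rewrite every CREATE statement in a file so that
# "SCHEMA"."NAME" becomes NAME. Returns the number of statements changed.
sub strip_schema_in_file {
    my ($path) = @_;

    # Slurp the whole file by undefining the record separator locally.
    open( my $in, '<', $path ) or die "open $path: $!";
    my $text = do { local $/; <$in> };
    close($in) or die "close $path: $!";

    # /m lets ^ match at each line start; /g rewrites every declaration.
    my $count = ( $text =~
          s{^(create\s+(?:or\s+replace\s+)?(?:package\s+body|\w+)\s+)"[^"]+"\."([^"]+)"}{$1$2}mig
    );

    open( my $out, '>', $path ) or die "open $path: $!";
    print {$out} $text or die "print $path: $!";
    close($out) or die "close $path: $!";

    return $count || 0;
}

# Demo on a temporary file so that nothing real is touched.
my ( $fh, $tmp ) = tempfile( UNLINK => 1 );
print {$fh} qq{CREATE OR REPLACE PACKAGE BODY "TRON2000"."MY_PKG"\nAS\n};
close($fh) or die "close $tmp: $!";

my $changed = strip_schema_in_file($tmp);
print "$changed declaration(s) rewritten\n";
```

This sketch overwrites the file in place for brevity. Writing to a temporary file and renaming it afterwards, as the script in the question does, is the safer choice for real data.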