Regex output is not correct

Time: 2014-07-06 05:56:45

Tags: regex perl parsing

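I am trying to parse the output of the prsqlcache command run against a Sybase server and store the information in a table. Each column value is stored in a scalar and saved into an array, and the whole array is then BCP'd into the target table.

As an example, I have included two sample outputs for cached statements. In the first statement's cache information, the SQL text contains the names of the columns I am trying to extract, so my code breaks because SSQL_DESC, ssql_name and so on appear more than once.

The simplest solution would have been to delete the lines containing the SQL text, because I thought I did not need that information. But it turns out I need it as well. Is there a way to make my existing logic work?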

Sample input file

Start of SSQL Hash Table at 0x0x2aacbdfdf050
Memory configured: 1024000 2k pages Memory used: 109219 2k pages
Bucket# 000 address 0x0x2aacbdfdf050
SSQL_DESC 0x0x2aad268cb8b0
ssql_name *ss1530075878_1111638016ss*
ssql_hashkey 0x0x42424000   ssql_id 1530075878
ssql_suid 31838     ssql_uid 1063880    ssql_dbid 14    ssql_spid 0
ssql_status 0x0x81  ssql_parallel_deg 1
ssql_isolate 1      ssql_tranmode 32
ssql_keep 0     ssql_usecnt 1   ssql_pgcount 20
ssql_optgoal allrows_oltp   ssql_optlevel ase_default
opt options bitmap  00809f172c6181fffb160500008009000000000000000000000000000000
SQL TEXT: select SSQL_DESC = 'addadads',ssql_name='aasass', ssql_hashkey='ssdddssddcs', ssql_id =1345, ssql_suid =4344, ssql_uid =2344, ssql_dbid=11, ssql_spid=0,ssql_status=0x024, ssql_parallel_deg=1, ssql_isolate = 1, ssql_tranmode = 32, ssql_keep = 1, ssql_usecnt =9, ssql_pgcount =8, ssql_optgoal = 'allrows', ssql_optlevel ='wee', opt = 'options', bitmap = '1235ddf3445553334' from table1
SSQL_DESC 0x0x2aad268cb8b0
ssql_name *ss1530075878_1111638016ss*
ssql_hashkey 0x0x433424030  ssql_id 1443475244
ssql_suid 553   ssql_uid 1443   ssql_dbid 15    ssql_spid 1
ssql_status 0x0x22  ssql_parallel_deg 1
ssql_isolate 1      ssql_tranmode 62
ssql_keep 0     ssql_usecnt 1   ssql_pgcount 22
ssql_optgoal allrows_oltp   ssql_optlevel ase_default
opt options bitmap  00809f172c6181fffb160500008009000000000000000000000000000000
SQL TEXT: select column from table

Code snippet

foreach my $line (@file){
    #print $line;
    my $string = "SSQL_DESC";
    my $string1 = "ssql_name";
    my $string2 ="ssql_hashkey";
    my $string3 = "ssql_suid";
    my $string4 = "ssql_status";
    my $string5 = "ssql_isolate";
    my $string6 = "ssql_keep";
    my $string7 = "ssql_optgoal";
    my $string8 = "bitmap";

    if ($line =~ /$string/i) { 
        my @sentence = split ' ', $line;
        $sql_desc = $sentence[1];
    }

    if ($line =~ /$string1/i) { 
        my @sentence = split ' ', $line;
        $sql_name = $sentence[1];
    }

    if ($line =~ /$string2/i) { 
        my @sentence = split ' ', $line;
        $sql_hashkey = $sentence[1];
        $ssql_id = $sentence[3];

        print Dumper \@sentence;
    }

    if($line =~ /$string3/i){
        my @sentence = split ' ', $line;
        $ssql_suid = $sentence[1];
        $ssql_uid = $sentence[3];
        $ssql_dbid = $sentence[5];
        $ssql_spid = $sentence[7];
    }

    if($line =~ /$string4/i){
        my @sentence = split ' ', $line;
        $ssql_status = $sentence[1];
        $ssql_parallel_deg = $sentence[3];
    }

    if($line =~ /$string5/i){
        my @sentence = split ' ', $line;
        $ssql_isolate = $sentence[1];
        $ssql_tranmode = $sentence[3];
    }

    if($line =~ /$string6/i){
        my @sentence = split ' ', $line;
        $ssql_keep = $sentence[1];
        $ssql_usecnt = $sentence[3];
        $ssql_pgcount = $sentence[5];
    }

    if($line =~ /$string7/i){
        my @sentence = split ' ', $line;
        $ssql_optgoal = $sentence[1];
        $ssql_optlevel = $sentence[3];
    }

    if ($line =~ /$string8/i) {
        my @sentence = split ' ', $line;
        $ssql_opt = $sentence[1];
        $bitmap = $sentence[3];

        @array = ($sql_desc, $sql_name, $sql_hashkey, $ssql_id, $ssql_suid, $ssql_uid, $ssql_dbid, $ssql_spid, $ssql_status, $ssql_parallel_deg, $ssql_isolate, $ssql_tranmode, $ssql_keep, $ssql_usecnt, $ssql_pgcount, $ssql_optgoal, $ssql_optlevel, $ssql_opt, $bitmap);
        #print Dumper \@array;
    }

}

Output:

$VAR1 = [
          'ssql_hashkey',
          '0x0x42424000',
          'ssql_id',
          '1530075878'
        ];
$VAR1 = [
          'SQL',
          'TEXT:',
          'select',
          'SSQL_DESC',
          '=',
          '\'addadads\',ssql_name=\'aasass\',',
          'ssql_hashkey=\'ssdddssddcs\',',
          'ssql_id',
          '=1345,',
          'ssql_suid',
          '=4344,',
          'ssql_uid',
          '=2344,',
          'ssql_dbid=11,',
          'ssql_spid=0,ssql_status=0x024,',
          'ssql_parallel_deg=1,',
          'ssql_isolate',
          '=',
          '1,',
          'ssql_tranmode',
          '=',
          '32,',
          'ssql_keep',
          '=',
          '1,',
          'ssql_usecnt',
          '=9,',
          'ssql_pgcount',
          '=8,',
          'ssql_optgoal',
          '=',
          '\'allrows\',',
          'ssql_optlevel',
          '=\'wee\',',
          'opt',
          '=',
          '\'options\',',
          'bitmap',
          '=',
          '\'1235ddf3445553334\'',
          'from',
          'table1'
        ];
$VAR1 = [
          'ssql_hashkey',
          '0x0x433424030',
          'ssql_id',
          '1443475244'
        ];

1 answer:

Answer 0 (score: 1)

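Whenever you find yourself writing many scalar declarations that are all used in a similar way, you should consider using a hash instead. The same applies to a sequence of variables whose names are identical except for an index number at the end: those should be an array.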
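As a minimal sketch of that idea (the variable name %row is only illustrative, and the line is copied from the sample input), one hash can stand in for a separate scalar per column:

```perl
use strict;
use warnings;

# One hash replaces a separate scalar for every column.
my %row;
my $line = "ssql_hashkey 0x0x42424000   ssql_id 1530075878";

# split ' ' yields alternating name/value tokens, which is the
# list shape that a hash assignment expects.
%row = ( %row, split ' ', $line );

print "$_ => $row{$_}\n" for sort keys %row;
```

This only works on lines that consist purely of name/value pairs; it is meant to show the data structure, not to replace the parsing logic.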

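The best way to solve this is to treat the SQL TEXT line as a special case. That is also useful because it is always the last line of each block, so it can act as the trigger for dumping the data found so far.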
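A small sketch of the special-casing, assuming every SQL line starts with the literal prefix "SQL TEXT:" as in the sample input (the two lines and the @kinds array below are made up for illustration):

```perl
use strict;
use warnings;

my @lines = (
    "ssql_keep 0     ssql_usecnt 1   ssql_pgcount 20",
    "SQL TEXT: select ssql_keep = 1 from table1",
);

my @kinds;
for my $line (@lines) {
    # Anchoring at the start of the line means field names that appear
    # inside the SQL statement are never treated as header fields.
    if ( $line =~ /^SQL TEXT:\s*(.*)/ ) {
        push @kinds, "sql: $1";
    }
    else {
        push @kinds, "header";
    }
}

print "$_\n" for @kinds;
```

Because the SQL TEXT test comes first, the second line never reaches the field-matching branch even though it contains ssql_keep.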

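I use the array @fields to hold the names of all the fields to be extracted. It is simple to derive a regex that matches any of the field names by using join with the pipe | alternation operator.

After that, it is just a matter of finding every occurrence of any field name in each line of the file and extracting the data field that follows it. All of this is stored in the hash %data.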
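The following sketch isolates those two steps on a single line from the sample input, using a hypothetical three-field subset of @fields. It assumes the field names contain no regex metacharacters; if they did, each name would need to go through quotemeta before the join.

```perl
use strict;
use warnings;

# Hypothetical three-field subset, just for illustration.
my @fields = qw/ ssql_suid ssql_uid ssql_dbid /;
my $re     = join '|', @fields;    # "ssql_suid|ssql_uid|ssql_dbid"

my $line = "ssql_suid 31838     ssql_uid 1063880    ssql_dbid 14    ssql_spid 0";

my %data;

# In scalar context a /g match resumes where the previous one ended,
# so the loop visits each name/value pair on the line in turn.
$data{$1} = $2 while $line =~ /\b($re)\s+(\S+)/g;

print "$_ => $data{$_}\n" for @fields;
```

ssql_spid is not in the subset, so it is skipped. The \b stops a name from matching in the middle of a longer word.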

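I have hard-coded ssql_opt as "bitmap" because it seems to always be the same. If that is wrong, then you will have to explain how to obtain its value from the file. I suspect that, in reality, there may be multiple opt values, and you will have to rethink how they are to be represented.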

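I have not rebuilt your final @array, because it is not clear whether it is anything more than a debugging artifact. If you need it, it is simply my @array = @data{@fields}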
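For reference, a short sketch of how that hash slice behaves, with made-up values for three of the fields (the | separator in the print is only an assumption for display, not a statement about what BCP expects):

```perl
use strict;
use warnings;

my @fields = qw/ ssql_keep ssql_usecnt ssql_pgcount /;
my %data   = ( ssql_usecnt => 1, ssql_pgcount => 20, ssql_keep => 0 );

# A hash slice returns the values in the order of the key list,
# not in the hash's internal order.
my @array = @data{@fields};

print join( '|', @array ), "\n";
```

Any field that was never seen comes back as undef in its slot, so the columns stay aligned with @fields.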

use strict;
use warnings;
use 5.010;

my @fields = qw/
  SSQL_DESC
  ssql_name
  ssql_hashkey      ssql_id
  ssql_suid         ssql_uid          ssql_dbid         ssql_spid
  ssql_status       ssql_parallel_deg
  ssql_isolate      ssql_tranmode
  ssql_keep         ssql_usecnt       ssql_pgcount 
  ssql_optgoal      ssql_optlevel 
  ssql_opt          bitmap
/;

my $re = join '|', @fields;
my %data;

while (my $line = <DATA>) {

  if ( $line =~ /^(SQL TEXT):\s*(.*)/ ) {

    $data{$1} = $2;

    $data{ssql_opt} = "bitmap";
    printf "%-20s => %s\n", $_, $data{$_} // '<undef>' for @fields;
    print "$1: $2\n";
    print "\n";

    %data = ();
  }
  else {
    $data{$1} = $2 while $line =~ /\b($re)\s+(\S+)/og;
  }
}

__DATA__
Start of SSQL Hash Table at 0x0x2aacbdfdf050
Memory configured: 1024000 2k pages Memory used: 109219 2k pages
Bucket# 000 address 0x0x2aacbdfdf050

SSQL_DESC 0x0x2aad268cb8b0
ssql_name *ss1530075878_1111638016ss*
ssql_hashkey 0x0x42424000   ssql_id 1530075878
ssql_suid 31838     ssql_uid 1063880    ssql_dbid 14    ssql_spid 0
ssql_status 0x0x81  ssql_parallel_deg 1
ssql_isolate 1      ssql_tranmode 32
ssql_keep 0     ssql_usecnt 1   ssql_pgcount 20
ssql_optgoal allrows_oltp   ssql_optlevel ase_default
opt options bitmap  00809f172c6181fffb160500008009000000000000000000000000000000
SQL TEXT: select SSQL_DESC = 'addadads',ssql_name='aasass', ssql_hashkey='ssdddssddcs', ssql_id =1345, ssql_suid =4344, ssql_uid =2344, ssql_dbid=11, ssql_spid=0,ssql_status=0x024, ssql_parallel_deg=1, ssql_isolate = 1, ssql_tranmode = 32, ssql_keep = 1, ssql_usecnt =9, ssql_pgcount =8, ssql_optgoal = 'allrows', ssql_optlevel ='wee', opt = 'options', bitmap = '1235ddf3445553334' from table1

SSQL_DESC 0x0x2aad268cb8b0
ssql_name *ss1530075878_1111638016ss*
ssql_hashkey 0x0x433424030  ssql_id 1443475244
ssql_suid 553   ssql_uid 1443   ssql_dbid 15    ssql_spid 1
ssql_status 0x0x22  ssql_parallel_deg 1
ssql_isolate 1      ssql_tranmode 62
ssql_keep 0     ssql_usecnt 1   ssql_pgcount 22
ssql_optgoal allrows_oltp   ssql_optlevel ase_default
opt options bitmap  00809f172c6181fffb160500008009000000000000000000000000000000
SQL TEXT: select column from table

Output

SSQL_DESC            => 0x0x2aad268cb8b0
ssql_name            => *ss1530075878_1111638016ss*
ssql_hashkey         => 0x0x42424000
ssql_id              => 1530075878
ssql_suid            => 31838
ssql_uid             => 1063880
ssql_dbid            => 14
ssql_spid            => 0
ssql_status          => 0x0x81
ssql_parallel_deg    => 1
ssql_isolate         => 1
ssql_tranmode        => 32
ssql_keep            => 0
ssql_usecnt          => 1
ssql_pgcount         => 20
ssql_optgoal         => allrows_oltp
ssql_optlevel        => ase_default
ssql_opt             => bitmap
bitmap               => 00809f172c6181fffb160500008009000000000000000000000000000000
SQL TEXT: select SSQL_DESC = 'addadads',ssql_name='aasass', ssql_hashkey='ssdddssddcs', ssql_id =1345, ssql_suid =4344, ssql_uid =2344, ssql_dbid=11, ssql_spid=0,ssql_status=0x024, ssql_parallel_deg=1, ssql_isolate = 1, ssql_tranmode = 32, ssql_keep = 1, ssql_usecnt =9, ssql_pgcount =8, ssql_optgoal = 'allrows', ssql_optlevel ='wee', opt = 'options', bitmap = '1235ddf3445553334' from table1

SSQL_DESC            => 0x0x2aad268cb8b0
ssql_name            => *ss1530075878_1111638016ss*
ssql_hashkey         => 0x0x433424030
ssql_id              => 1443475244
ssql_suid            => 553
ssql_uid             => 1443
ssql_dbid            => 15
ssql_spid            => 1
ssql_status          => 0x0x22
ssql_parallel_deg    => 1
ssql_isolate         => 1
ssql_tranmode        => 62
ssql_keep            => 0
ssql_usecnt          => 1
ssql_pgcount         => 22
ssql_optgoal         => allrows_oltp
ssql_optlevel        => ase_default
ssql_opt             => bitmap
bitmap               => 00809f172c6181fffb160500008009000000000000000000000000000000
SQL TEXT: select column from table