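Extracting table names from multiple strings with a Perl regex

Time: 2017-07-13 10:49:08

Tags: regex perl

I want to extract all the table names from this partial SQL query. Note that I am reading it line by line, not as a single string: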

from DS_RAN_CORID t1, NAMON_GNS t4, NA_VAL_ROL t6, A_TI_G_V t7, PTSM_TCR t2
left outer join T_TR_COR_LAG t3
on  t2.inp_seq = t3.inp_seq and t3.ti_number = t2.ti_number
left outer join OUT_TR_COR t5
on  t2.inp_seq=t5.inp_seq and t5.ti_number=t2.ti_number
where t1.inp_seq = t2.inp_seq and t2.ti_number = t6.interval_number and
      t1.ti_grp = t7.dm_group and t2.ti_number = t7.interval_number;

The tables I need to extract: DS_RAN_CORID, NAMON_GNS, NA_VAL_ROL, A_TI_G_V, PTSM_TCR, T_TR_COR_LAG, OUT_TR_COR

What I have tried:

  1. Matching t1., t2., etc. with any letter followed by any digit:

    $string=~m/(\S).\d/gi;

  2. Assuming my code is correct, I need to compare t1. with TABLE_NAME t1 and extract the table name using the following:

    $string=~m/\w+\s+(S)\d/gi;

1 Answer:

Answer 0 (score: 1)

I thought SQL::Parser might help, but it has trouble with this SQL. I am leaving this here as a point of reference:

#!/usr/bin/env perl

use strict;
use warnings;

use SQL::Parser;

my $sql = <<SQL;
select *
from DS_RAN_CORID t1, NAMON_GNS t4, NA_VAL_ROL t6, A_TI_G_V t7, PTSM_TCR t2
left outer join T_TR_COR_LAG t3
on  t2.inp_seq = t3.inp_seq and t3.ti_number = t2.ti_number
left outer join OUT_TR_COR t5
on  t2.inp_seq=t5.inp_seq and t5.ti_number=t2.ti_number
where t1.inp_seq = t2.inp_seq and t2.ti_number = t6.interval_number and
      t1.ti_grp = t7.dm_group and t2.ti_number = t7.interval_number;
SQL

my $parser = SQL::Parser->new;
$parser->dialect('MySQL');

die unless $parser->parse( $sql );

print "$_\n" for @{ $parser->structure->{table_names} };

As for using a regular expression, I note that all the table names appear to consist of uppercase ASCII letters and underscores:

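use YAML;  # provides Dump (YAML::XS would also work)

# In list context the match returns (name, alias, name, alias, ...);
# reverse flips that list so the hash maps alias => table name.
my (%tables) = reverse ($sql =~ /([A-Z][A-Z_]+) \s+ (t[1-9])/gx);
print Dump \%tables;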
---
t1: DS_RAN_CORID
t2: PTSM_TCR
t3: T_TR_COR_LAG
t4: NAMON_GNS
t5: OUT_TR_COR
t6: NA_VAL_ROL
t7: A_TI_G_V
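
Since the question reads the query one line at a time, the same pattern can also be applied per line while accumulating matches in a hash. The following is only a sketch under a few assumptions: the @lines array is a hypothetical stand-in for whatever file handle is actually being read, and the alias part is loosened to t\d+ (with \b anchors) so that aliases such as t10 would also match. That last change is an adaptation, not part of the answer above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical input: the question's query, split into lines to mimic
# reading from a file handle one line at a time.
my @lines = (
    'from DS_RAN_CORID t1, NAMON_GNS t4, NA_VAL_ROL t6, A_TI_G_V t7, PTSM_TCR t2',
    'left outer join T_TR_COR_LAG t3',
    'on  t2.inp_seq = t3.inp_seq and t3.ti_number = t2.ti_number',
    'left outer join OUT_TR_COR t5',
    'on  t2.inp_seq=t5.inp_seq and t5.ti_number=t2.ti_number',
    'where t1.inp_seq = t2.inp_seq and t2.ti_number = t6.interval_number and',
    '      t1.ti_grp = t7.dm_group and t2.ti_number = t7.interval_number;',
);

my %tables;    # alias => table name
for my $line (@lines) {
    # The match is case-sensitive, so lowercase keywords and column
    # names are skipped; \b keeps a match from starting mid-word.
    while ( $line =~ /\b([A-Z][A-Z_]+) \s+ (t\d+)\b/gx ) {
        $tables{$2} = $1;
    }
}

print "$_: $tables{$_}\n" for sort keys %tables;
```

Like the one-liner above, this still relies on table names being uppercase and on aliases having the form t followed by digits; a query that breaks either convention would need a different pattern.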