SQL statement or regular expression to extract tables from this text?

Asked: 2014-03-19 21:46:31

Tags: sql regex sqlplus

Specifically, this is SQL text stored in a file.

   select substr(to_char(ctl.tody_run_dt,'YYYYMMDD'),1,8) tody_yyyymmdd,           
        substr(to_char(cdr.nxt_proc_dt,'YYYYMMDD'),1,8) nxt_proc_yyyymmdd,       
        add_months(ctl.tody_run_dt - cdr.past_accru_dys,                         
              - ct.nbr_cycl_for_adj) beg_proc_dt,                           
                from tbl_crd )                                                         
      and prv.calendar_run_dt =                                                    
       ( select max(calendar_run_dt)                                              
      from run_tbl1 prv2                                                     

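From this I want to extract all the tables. Is this too complicated to do with a regular expression? Is there a way, or should I write a program? I am just trying to come up with an algorithm.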

2 answers:

Answer 0 (score: 0)

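You can do a linear search like this. The pattern is relaxed to fit your example: it only targets the from keyword and stops at the other keywords (select, from, where, and).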

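The table data is captured in group 1. On each match in the find loop, the captured text still has to be split into the individual tables.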

 #  from\s+((?!(?:select|from|where|and)\b)\w+(?:[,\s]+(?!(?:select|from|where|and)\b)\w+)*)

 from 
 \s+ 
 (                # (1 start), Contains all the table info
      (?!              # exclude keywords
           (?:
                select
             |  from
             |  where
             |  and 
           )
           \b 
      )
      \w+ 
      (?:
           [,\s]+          
           (?!              # exclude keywords
                (?:
                     select 
                  |  from
                  |  where
                  |  and 
                )
                \b 
           )
           \w+ 
      )*
 )                # (1 end)

Perl test case

$/ = undef;

$str = <DATA>;

while ( $str =~ /from\s+((?!(?:select|from|where|and)\b)\w+(?:[,\s]+(?!(?:select|from|where|and)\b)\w+)*)/g )
{
     print "\n'$1'";
}

__DATA__
   select substr(to_char(ctl.tody_run_dt,'YYYYMMDD'),1,8) tody_yyyymmdd,
        substr(to_char(cdr.nxt_proc_dt,'YYYYMMDD'),1,8) nxt_proc_yyyymmdd,
        add_months(ctl.tody_run_dt - cdr.past_accru_dys,
              - ct.nbr_cycl_for_adj) beg_proc_dt,
       (ctl.tody_run_dt + cdr.futr_accru_dys) end_proc_dt,
     ctl.tody_end_proc_dt,
   ctl.prv_end_proc_dt,
   cdr.fst_proc_dy,
   cdr.lst_proc_dy,
   cdr.accru_nbr_of_dys,
   cdr.dy_of_wk,
        from run_tbl1 cdr, runtbl
       run_tbl1 prv,
       run_tbl_cntl ctl,
       tbl_crd ct
       where cdr.calendar_run_dt = ctl.tody_run_dt
      and ct.nbr_cycl_for_adj =
      ( select max(nbr_cycl_for_adj)

       from tbl_crd )
      and prv.calendar_run_dt =
       ( select max(calendar_run_dt)
      from run_tbl1 prv2
      where prv2.calendar_run_dt < ctl.tody_run_dt
       and prv2.accru_nbr_of_dys = 1 )
       and rownum = 1

Output >>

'run_tbl1 cdr, runtbl
       run_tbl1 prv,
       run_tbl_cntl ctl,
       tbl_crd ct'
'tbl_crd'
'run_tbl1 prv2'
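
The captured groups above still contain aliases, commas and line breaks. As a minimal sketch of the splitting step, assuming each comma-separated item has the form "table" or "table alias", the same pattern can be used from Java. The class name FromClauseTables and the method tableNames are hypothetical names chosen for illustration, and the sample SQL in main is a shortened stand-in rather than the original query.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FromClauseTables {
    // Same pattern as in the answer: "from", then a comma/whitespace-separated
    // run of words, none of which is one of the excluded keywords.
    private static final Pattern FROM_CLAUSE = Pattern.compile(
        "from\\s+((?!(?:select|from|where|and)\\b)\\w+"
        + "(?:[,\\s]+(?!(?:select|from|where|and)\\b)\\w+)*)");

    // Returns the distinct table names in order of first appearance.
    // Assumption: the first word of each comma-separated item is the table
    // name and any second word is its alias.
    static List<String> tableNames(String sql) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = FROM_CLAUSE.matcher(sql);
        while (m.find()) {
            for (String item : m.group(1).split(",")) {
                String[] words = item.trim().split("\\s+");
                if (!words[0].isEmpty()) {
                    names.add(words[0]);
                }
            }
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        // Shortened sample query, not the full statement from the question.
        String sql = "select a.x from run_tbl1 cdr, run_tbl_cntl ctl\n"
            + " where a.x = ( select max(y) from tbl_crd )\n"
            + " and a.z = ( select max(z) from run_tbl1 prv2 where prv2.z < 1 )";
        System.out.println(tableNames(sql));
    }
}
```

For the sample query this prints [run_tbl1, run_tbl_cntl, tbl_crd]. Like the Perl pattern, the sketch is case-sensitive and only looks at from clauses, so explicit JOIN syntax, schema-qualified names and quoted identifiers are not handled.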

Answer 1 (score: 0)

First, I had to correct some syntax errors in the SQL.

... cdr.dy_of_wk,  <<<< the comma is wrong
    from run_tbl1 cdr ...

Here is a proof of concept that extracts the table names using JSQLParser V0.8.9 (https://github.com/JSQLParser/JSqlParser).

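// Imports needed by this snippet (the enclosing class declaration is omitted)
import java.util.List;
import net.sf.jsqlparser.JSQLParserException;
import net.sf.jsqlparser.parser.CCJSqlParserUtil;
import net.sf.jsqlparser.statement.select.Select;
import net.sf.jsqlparser.util.TablesNamesFinder;

public static void main(String[] args) throws JSQLParserException {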
    TablesNamesFinder tfinder = new TablesNamesFinder();
    String sql = "select substr(to_char(ctl.tody_run_dt,'YYYYMMDD'),1,8) tody_yyyymmdd, "
            + " substr(to_char(cdr.nxt_proc_dt,'YYYYMMDD'),1,8) nxt_proc_yyyymmdd, "
            + " add_months(ctl.tody_run_dt - cdr.past_accru_dys, "
            + " - ct.nbr_cycl_for_adj) beg_proc_dt, "
            + " (ctl.tody_run_dt + cdr.futr_accru_dys) end_proc_dt, "
            + " ctl.tody_end_proc_dt,  "
            + " ctl.prv_end_proc_dt,  "
            + " cdr.fst_proc_dy,  "
            + " cdr.lst_proc_dy,  "
            + " cdr.accru_nbr_of_dys,  "
            + " cdr.dy_of_wk  "
            + " from run_tbl1 cdr, runtbl,  "
            + " run_tbl1 prv,  "
            + " run_tbl_cntl ctl,  "
            + " tbl_crd ct  "
            + " where cdr.calendar_run_dt = ctl.tody_run_dt  "
            + " and ct.nbr_cycl_for_adj =  "
            + " ( select max(nbr_cycl_for_adj)  "
            + " from tbl_crd )  "
            + " and prv.calendar_run_dt =  "
            + " ( select max(calendar_run_dt)  "
            + " from run_tbl1 prv2  "
            + " where prv2.calendar_run_dt < ctl.tody_run_dt "
            + " and prv2.accru_nbr_of_dys = 1 )  "
            + " and rownum = 1 ";

    //parse SQL statement
    Select select = (Select) CCJSqlParserUtil.parse(sql);
    //extract table names
    List<String> tableList = tfinder.getTableList(select);

    System.out.println(tableList);
}

which outputs

[run_tbl1, runtbl, run_tbl_cntl, tbl_crd]