How to change the format/pattern of a string?

Time: 2016-05-18 18:03:28

Tags: java sql regex eclipse

I have a string like the one below, and I want to write something that checks whether it follows this format.

Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VALUE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ');

This one would be erroneous.

Insert into TABLE(A, B, C, D, E, F, G, H) values (123, ''VALUE ''' , '', '', 'RED', 'FAX', '',  '');

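As you can see, the second one has extra commas or extra quotes. (Basically, check everything inside the values parentheses; the insert part never changes.)

I want to detect the erroneous pattern and edit it dynamically. Any ideas?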

1 Answer:

Answer 0: (score: 1)

Description

^[^(]*table\([^)]*\)[^(]*?Values\s*
(\((?:(?:[^';\n\r]*|'[^']*')\s*(?:(?=\))|,)\s*)*\))


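This regular expression will do the following:

  • Find lines that start with the Insert into TABLE( .... ) values structure
  • Validate that the values section contains a comma-delimited list of values, where each value is either
    • a string not surrounded by quotes, like 123
    • a string surrounded by single quotes, like 'red'
  • Allow whitespace to surround any delimiting comma
  • Allow quoted strings to contain commas, like 'This, value has a comma'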

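Notes:

  • If the regular expression does not match your string, then there is a problem with the source string
  • I recommend using the case-insensitive flag, because the pattern spells table and Values differently from the TABLE and values in the sample statements.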
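The check described above could be wired up in Java roughly as follows. This is only a minimal sketch: the class name `InsertValidator` and the method `isValid` are hypothetical, and the answer's two-line pattern is joined into a single string, since the line break between the two halves is only ignored when the ignore-whitespace option is on.

```java
import java.util.regex.Pattern;

final class InsertValidator {
    // The answer's pattern, joined into one string. CASE_INSENSITIVE lets
    // "table" and "Values" match "TABLE" and "values" in the statements.
    private static final Pattern INSERT = Pattern.compile(
        "^[^(]*table\\([^)]*\\)[^(]*?Values\\s*"
        + "(\\((?:(?:[^';\\n\\r]*|'[^']*')\\s*(?:(?=\\))|,)\\s*)*\\))",
        Pattern.CASE_INSENSITIVE);

    // Returns true when the statement begins with a well-formed
    // "Insert into TABLE(...) values (...)" structure.
    static boolean isValid(String statement) {
        return INSERT.matcher(statement).find();
    }

    public static void main(String[] args) {
        String good = "Insert into TABLE(A, B, C, D, E, F, G, H) values "
            + "(123, 'VALUE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ');";
        String bad = "Insert into TABLE(A, B, C, D, E, F, G, H) values "
            + "(123, ''VALUE ''' , '', '', 'RED', 'FAX', '',  '');";
        System.out.println(isValid(good)); // true
        System.out.println(isValid(bad));  // false
    }
}
```

Note that the sketch only reports whether a statement matches; it does not attempt to repair a statement that fails the check.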

Example

Live Demo

In this example I use the multiline, global, and ignore-whitespace options to better illustrate how it works.

https://regex101.com/r/iW9gI7/1

Source string

The last line here is invalid.

Insert into TABLE(A, B, C, D, E, F, G, H) values (123);
Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VALUE ', ' ', ' ', 'XXX', 'CCC', ' ', ' ');
Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VAL,UE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ');
Insert into TABLE(A, B, C, D, E, F, G, H) values (123, ''VALUE ''' , '', '', 'RED', 'FAX', '',  '')

Valid matches

Insert into TABLE(A, B, C, D, E, F, G, H) values (123)
Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VALUE ', ' ', ' ', 'XXX', 'CCC', ' ', ' ')
Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VAL,UE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ')
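The demo's multiline and global options have rough Java counterparts in `Pattern.MULTILINE` and a `while (matcher.find())` loop. The sketch below assumes the four source lines are joined with `\n`; the class name `InsertScanner` and the method `validStatements` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InsertScanner {
    // MULTILINE makes ^ match at the start of every line, not only at the
    // start of the whole input; CASE_INSENSITIVE is the flag the answer recommends.
    private static final Pattern INSERT = Pattern.compile(
        "^[^(]*table\\([^)]*\\)[^(]*?Values\\s*"
        + "(\\((?:(?:[^';\\n\\r]*|'[^']*')\\s*(?:(?=\\))|,)\\s*)*\\))",
        Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    // Collects the text of every match; statements that do not match are skipped.
    static List<String> validStatements(String script) {
        List<String> found = new ArrayList<>();
        Matcher m = INSERT.matcher(script);
        while (m.find()) {          // repeated find() plays the role of the "global" option
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String script = String.join("\n",
            "Insert into TABLE(A, B, C, D, E, F, G, H) values (123);",
            "Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VALUE ', ' ', ' ', 'XXX', 'CCC', ' ', ' ');",
            "Insert into TABLE(A, B, C, D, E, F, G, H) values (123, 'VAL,UE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ');",
            "Insert into TABLE(A, B, C, D, E, F, G, H) values (123, ''VALUE ''' , '', '', 'RED', 'FAX', '',  '')");
        validStatements(script).forEach(System.out::println);
    }
}
```

Run against the source string above, this prints the first three statements without their trailing semicolons, since the pattern stops at the closing parenthesis of the values list.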

Explanation

NODE                     EXPLANATION
----------------------------------------------------------------------
  ^                        the beginning of a "line"
----------------------------------------------------------------------
  [^(]*                    any character except: '(' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  table                    'table'
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  [^)]*                    any character except: ')' (0 or more times
                           (matching the most amount possible))
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
  [^(]*?                   any character except: '(' (0 or more times
                           (matching the least amount possible))
----------------------------------------------------------------------
  Values                   'Values'
----------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \(                       '('
----------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the most amount
                             possible)):
----------------------------------------------------------------------
      (?:                      group, but do not capture:
----------------------------------------------------------------------
        [^';\n\r]*               any character except: ''', ';', '\n'
                                 (newline), '\r' (carriage return) (0
                                 or more times (matching the most
                                 amount possible))
----------------------------------------------------------------------
       |                        OR
----------------------------------------------------------------------
        '                        '\''
----------------------------------------------------------------------
        [^']*                    any character except: ''' (0 or more
                                 times (matching the most amount
                                 possible))
----------------------------------------------------------------------
        '                        '\''
----------------------------------------------------------------------
      )                        end of grouping
----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ")
                               (0 or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
      (?:                      group, but do not capture:
----------------------------------------------------------------------
        (?=                      look ahead to see if there is:
----------------------------------------------------------------------
          \)                       ')'
----------------------------------------------------------------------
        )                        end of look-ahead
----------------------------------------------------------------------
       |                        OR
----------------------------------------------------------------------
        ,                        ','
----------------------------------------------------------------------
      )                        end of grouping
----------------------------------------------------------------------
      \s*                      whitespace (\n, \r, \t, \f, and " ")
                               (0 or more times (matching the most
                               amount possible))
----------------------------------------------------------------------
    )*                       end of grouping
----------------------------------------------------------------------
    \)                       ')'
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
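The table ends with capture group \1, which holds the parenthesized values list. As a hedged illustration of reading that group from Java, the sketch below uses a hypothetical class `ValuesExtractor` with a hypothetical method `valuesList`; it returns an empty `Optional` when the statement does not match.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ValuesExtractor {
    private static final Pattern INSERT = Pattern.compile(
        "^[^(]*table\\([^)]*\\)[^(]*?Values\\s*"
        + "(\\((?:(?:[^';\\n\\r]*|'[^']*')\\s*(?:(?=\\))|,)\\s*)*\\))",
        Pattern.CASE_INSENSITIVE);

    // Returns capture group 1 (the values list, including its parentheses),
    // or Optional.empty() when the statement does not match the pattern.
    static Optional<String> valuesList(String statement) {
        Matcher m = INSERT.matcher(statement);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String statement = "Insert into TABLE(A, B, C, D, E, F, G, H) values "
            + "(123, 'VAL,UE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ');";
        // Prints: (123, 'VAL,UE ' , ' ', ' ', 'XXX', 'CCC', ' ',  ' ')
        System.out.println(valuesList(statement).orElse("no match"));
    }
}
```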