Using a regular expression to remove everything else from a String

Asked: 2014-03-24 13:24:58

Tags: regex groovy

I have this regular expression, which describes what is allowed in the String.

    ^\pL*['#-]?\pL*\$

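The requirements have changed, and I now need to remove everything else from the String instead of rejecting it. For example, only one of the special characters ', # and - is allowed.

How can I change this regular expression so that it removes everything that does not fit the pattern?

Here is a list of expected values: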

JohnO'Connell  -> allowed. should be as is.
JohnDias       -> allowed. should be as is.
JohnOConnell'  -> allowed. should be as is.
JohnOConnell#  -> allowed. should be as is.
JohnOConnell-  -> allowed. should be as is.
JohnOConnell-# -> should return JohnOConnell-
JohnOConn34ell -> should return JohnOConnell
*JohnOConnell  -> should return JohnOConnell
JohnOConnell*  -> should return JohnOConnell
JohnOConn$%ell -> should return JohnOConnell
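
For reference, the "allowed" rows in this list can be reproduced with the original pattern used as a validator. This is only a minimal sketch: `isAllowed` is a hypothetical helper name, and it assumes the trailing `\$` in the pattern above is meant as the end-of-input anchor `$`.

```groovy
// Hypothetical helper: true when the whole input consists of letters with
// at most one of ' # - somewhere in it.
// ==~ requires a full match, so explicit ^ and $ anchors are not needed.
boolean isAllowed(String input) {
    input ==~ /\p{L}*['#-]?\p{L}*/
}

assert isAllowed("JohnO'Connell")    // one special character: allowed
assert isAllowed("JohnDias")         // letters only: allowed
assert !isAllowed("JohnOConnell-#")  // two special characters: rejected
assert !isAllowed("JohnOConn34ell")  // digits: rejected
```

A validator like this can only accept or reject an input; turning the rejected rows into their expected output needs a replacement step.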

Thanks

1 Answer:

Answer 0: (score: 0)

If I understand correctly, you could do it like this:

// Test data
def tests = [ [ input:"JohnO'Connell",   output:"JohnO'Connell" ],
              [ input:"JohnDias",        output:"JohnDias" ],
              [ input:"JohnOConnell'",   output:"JohnOConnell'" ],
              [ input:"JohnOConnell#",   output:"JohnOConnell#" ],
              [ input:"JohnOConnell-",   output:"JohnOConnell-" ],
              [ input:"JohnOConnell-#",  output:"JohnOConnell-" ],
              [ input:"JohnOConn34ell",  output:"JohnOConnell" ],
              [ input:"*JohnOConnell",   output:"JohnOConnell" ],
              [ input:"JohnOConnell*",   output:"JohnOConnell" ],
              [ input:"JohnOConn\$%ell", output:"JohnOConnell" ] ]

String normalize( String input ) {
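    int idx = 0  // number of special characters seen so far
    // Note: A-Za-z keeps ASCII letters only, whereas \pL in the question
    // also accepts non-ASCII letters.
    input.replaceAll( /[^A-Za-z'#\-]/, '' )  // remove all disallowed chars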
         .replaceAll( /['#\-]/ ) { match ->  // replace 2nd+ instance of special chars
             idx++ == 0 ? match : ''
         }
}

tests.each {
    assert normalize( it.input ) == it.output
}
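
The same idea can also be sketched as a single pass over the characters, which avoids the second `replaceAll` and keeps any Unicode letter rather than only `A-Za-z`. This is an illustrative variant under those assumptions, not part of the original answer; `normalizeUnicode` is a hypothetical name.

```groovy
// Hypothetical variant: keep every letter (Character.isLetter covers the
// same Unicode categories as \p{L} for characters in the Basic Multilingual
// Plane) plus the first occurrence of ' # or -, and drop everything else.
String normalizeUnicode(String input) {
    boolean seenSpecial = false
    StringBuilder out = new StringBuilder()
    for (char c : input.toCharArray()) {
        if (Character.isLetter(c)) {
            out.append(c)
        } else if (!seenSpecial && "'#-".indexOf((int) c) >= 0) {
            out.append(c)        // first special character is kept
            seenSpecial = true   // any later one is dropped
        }
    }
    out.toString()
}

assert normalizeUnicode("JohnOConnell-#") == "JohnOConnell-"
assert normalizeUnicode("JohnOConn34ell") == "JohnOConnell"
```

Both versions keep whichever special character comes first, so an input such as `John#O'Connell` would become `John#OConnell`.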