Replace/remove characters that do not match the regular expression (.NET)

Asked: 2011-05-27 15:28:02

Tags: c# regex replace remove-if

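I have a regular expression that validates a string. Now I want to remove every character that does not match that regular expression.

For example: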

regExpression = @"^([\w\'\-\+])"

text = "This is a sample text with some invalid characters -+%&()=?";

//Remove characters that do not match regExp.

result = "This is a sample text with some invalid characters -+";

Any ideas on how to use the regular expression to determine the valid characters and remove all the others?

Many thanks.

3 answers:

Answer 0 (score: 13)

I believe you can do this in one line (whitelist the allowed characters and replace everything else):

var result = Regex.Replace(text, @"[^\w\s\-\+]", "");

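Technically, it produces this: "This is a sample text with some invalid characters - +", which is slightly different from your example (the extra space between the - and the +).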
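A minimal sketch of the same negated-class idea, assuming the question's own allowed set (word characters, apostrophe, hyphen, plus) with whitespace added so the words stay separated. The names `NegatedClassDemo` and `KeepAllowed` are illustrative and not part of the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Removes every character that is NOT a word character, whitespace,
    // apostrophe, hyphen, or plus sign.
    // The leading ^ inside [...] negates the character class.
    public static string KeepAllowed(string text)
    {
        return Regex.Replace(text, @"[^\w\s'\-+]", "");
    }

    public static void Main()
    {
        string text = "This is a sample text with some invalid characters -+%&()=?";
        Console.WriteLine(KeepAllowed(text));
    }
}
```

Inside a character class the plus sign needs no escaping; only the hyphen is escaped here so that it is not read as a range.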

Answer 1 (score: 11)

Simple:

var match = Regex.Match(text, regExpression);
string result = "";
if(match.Success)
    result = match.Value;

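Removing the characters that do not match is the same as keeping the ones that do. That is what we are doing here.

If the expression can match more than once in the text, you can use: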

var result = Regex.Matches(text, regExpression).Cast<Match>()
                  .Aggregate("", (s, e) => s + e.Value, s => s);
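One caveat: a pattern anchored with `^`, like the one in the question, can only match at the start of the input, so collecting its matches yields at most the first character. Below is a sketch of the keep-the-matches approach with the anchor dropped, whitespace added, and a `+` quantifier appended. Those three changes are assumptions, as are the names `KeepMatchesDemo` and `KeepMatches`.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class KeepMatchesDemo
{
    // Concatenates every match of the pattern, in order of appearance.
    public static string KeepMatches(string text, string allowedRunPattern)
    {
        return string.Concat(
            Regex.Matches(text, allowedRunPattern)
                 .Cast<Match>()
                 .Select(m => m.Value));
    }

    public static void Main()
    {
        string text = "This is a sample text with some invalid characters -+%&()=?";
        // Unanchored class: each match is one run of allowed characters.
        Console.WriteLine(KeepMatches(text, @"[\w\s'\-+]+"));
    }
}
```

`string.Concat` over the match values does the same job as the `Aggregate` call above without building intermediate strings one at a time.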

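Answer 2 (score: 1)

Thanks to the answer to "Replace chars if not match", I have created a helper method to strip unaccepted characters.

The allowed pattern should be in Regex format and is expected to be wrapped in square brackets. The function inserts a caret (^) after the opening square bracket. I expect it will not work for every regular expression that describes a set of valid characters, but it works for the relatively simple sets we are using.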

// See https://stackoverflow.com/questions/4460290/replace-chars-if-not-match
// and https://stackoverflow.com/questions/6154426/replace-remove-characters-that-do-not-match-the-regular-expression-net
/// <summary>
/// Replaces characters that are not expected.
/// </summary>
/// <param name="text">The text.</param>
/// <param name="allowedPattern">The allowed pattern in Regex format, expected to be wrapped in square brackets.</param>
/// <param name="replacement">The replacement.</param>
/// <returns>The text with every character outside the allowed set replaced.</returns>
public static string ReplaceNotExpectedCharacters(this string text, string allowedPattern, string replacement)
{
    // StripBrackets is not defined in this snippet; it presumably removes the surrounding "[" and "]".
    allowedPattern = allowedPattern.StripBrackets("[", "]");
    // [^ ] at the start of a character class negates it: it matches characters not in the class.
    var result = Regex.Replace(text, @"[^" + allowedPattern + "]", replacement);
    return result;
}

public static string RemoveNonAlphanumericCharacters(this string text)
{
    var result = text.ReplaceNotExpectedCharacters(NonAlphaNumericCharacters, "");
    return result;
}

// Despite its name, this constant holds the allowed (alphanumeric) character class.
public const string NonAlphaNumericCharacters = "[a-zA-Z0-9]";
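
`StripBrackets` is not shown in the answer. A hypothetical stand-in, assuming it simply removes one leading "[" and one trailing "]" when both are present, could look like the sketch below, shown together with a self-contained copy of the helper. The class names `StringCleanupExtensions` and `Program` are also illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringCleanupExtensions
{
    // Hypothetical stand-in for the StripBrackets helper the answer calls;
    // the real implementation is not shown there.
    public static string StripBrackets(this string text, string open, string close)
    {
        if (text.Length >= open.Length + close.Length
            && text.StartsWith(open, StringComparison.Ordinal)
            && text.EndsWith(close, StringComparison.Ordinal))
        {
            return text.Substring(open.Length, text.Length - open.Length - close.Length);
        }
        return text;
    }

    // Same logic as the answer's method: negate the allowed class, replace the rest.
    public static string ReplaceNotExpectedCharacters(
        this string text, string allowedPattern, string replacement)
    {
        string inner = allowedPattern.StripBrackets("[", "]");
        return Regex.Replace(text, "[^" + inner + "]", replacement);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("abc-123 (x)".ReplaceNotExpectedCharacters("[a-zA-Z0-9]", ""));
    }
}
```

As the answer already warns, this only works when the allowed pattern is a single bracketed character class; something like `[a-z]+` or an alternation would not survive the bracket stripping.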