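Regex does not work with reserved characters in search text (C#)

Date: 2013-12-03 13:42:36

Tags: c# regex character reserved

I have a regex that met all my requirements until now, when I suddenly got a string containing reserved characters: the + in C++ and the # in C#. The code below works for my whole collection of words except C++ and C#.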

MatchCollection matches= Regex.Matches(@"This  program is written in C# We'll delete it after ten days", @"\bC\+\+\b");
foreach(Match m in matches)
{
      Console.Write(m.Value);
}

Can anyone point out the reason?

4 answers:

Answer 0 (score: 3)

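You should use \B instead of \b at the second boundary. \b only matches between a word character and a non-word character; # and + are non-word characters, so there is no word boundary between them and a following space.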

MatchCollection matches= Regex.Matches(@"This  program is written in C# We'll delete it after ten days", @"\bC\#\B");

You can read the following link for more information: http://www.regular-expressions.info/wordboundaries.html
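
To see why the trailing \b is the problem, the following minimal sketch compares the two boundary choices. The class name BoundaryDemo, the helper Finds, and the sample sentence are made up for illustration; only Regex.IsMatch is the actual .NET API.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Hypothetical helper: true when `pattern` matches somewhere in `text`.
    public static bool Finds(string text, string pattern)
    {
        return Regex.IsMatch(text, pattern);
    }

    public static void Main()
    {
        // Made-up sample sentence containing both tokens.
        const string text = "This program is written in C# and C++ today";

        // '#' and the space after it are both non-word characters, so no
        // word boundary exists between them and the trailing \b fails.
        Console.WriteLine(Finds(text, @"\bC#\b"));      // False
        Console.WriteLine(Finds(text, @"\bC\+\+\b"));   // False

        // \B asserts "not a word boundary", which holds between '#' (or '+')
        // and a following space, punctuation mark, or the end of the string.
        Console.WriteLine(Finds(text, @"\bC#\B"));      // True
        Console.WriteLine(Finds(text, @"\bC\+\+\B"));   // True

        // \B still rejects a token that runs straight into a word character.
        Console.WriteLine(Finds("C#x", @"\bC#\B"));     // False
    }
}
```

One consequence of this choice is that \B also accepts punctuation after the token, so "C#." or "C++!" would match; whether that is desirable depends on your word list.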

Answer 1 (score: 1)

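You can use the following pattern, which stores the match in group 1. Note that the trailing \s requires whitespace after the token, so a C# or C++ at the very end of the string is not matched.

Pattern

\b(C(?:\+\+|\#))\s

And this C# code:

CODE

MatchCollection matches= Regex.Matches(@"This  program is written in C# We'll delete it after ten days", @"\b(C(?:\+\+|\#))\s");

foreach(Match m in matches)
{
     Console.Write(m.Groups[1].Value);
}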

INPUT

This  program is written in C# We'll delete it after ten days

OUTPUT

C#

INPUT

This  program is written in C++ We'll delete it after ten days

OUTPUT

C++

Answer 2 (score: 0)

  

"The code below works for my whole collection of words except C++ and C#"

To get that match to work you need a regex such as @"(?:C\+\+)|(?:C#)" (the original answer linked a Regex 101 demo as proof).
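
As a rough sketch of how that alternation behaves in C#, the snippet below collects every hit from a made-up sentence. The class AlternationDemo and the method FindTokens are hypothetical names, not part of the original answer. Because the pattern carries no boundary checks, it also matches inside longer tokens such as MSVC++, which may or may not be what you want.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AlternationDemo
{
    // The pattern suggested in this answer: no anchors, no word boundaries.
    private const string Pattern = @"(?:C\+\+)|(?:C#)";

    // Hypothetical helper: returns every matched token in order of appearance.
    public static List<string> FindTokens(string text)
    {
        var tokens = new List<string>();
        foreach (Match m in Regex.Matches(text, Pattern))
        {
            tokens.Add(m.Value);
        }
        return tokens;
    }

    public static void Main()
    {
        // Prints "C#, C++"
        Console.WriteLine(string.Join(", ", FindTokens("Written in C# and C++, not in C")));

        // Without boundaries the pattern also matches inside a longer token.
        // Prints "C++"
        Console.WriteLine(string.Join(", ", FindTokens("Built with MSVC++")));
    }
}
```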

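Answer 3 (score: 0)

In your case, rather than looking for word boundaries (\b) or non-word boundaries (\B), you might consider looking for whitespace (\s+), the beginning of the line (^), and the end of the line ($).

Here is a regex that does this: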

(?:^|\s+)(C#|C\+\+)(?=\s+|$)

Here is a Perl program that demonstrates the regex on a sample data set. (The original answer also linked a live demo.)

#!/usr/bin/perl -w

use strict;
use warnings;

while (<DATA>) {
    chomp;

#   A - Preceded by the beginning of the line or 1 or more whitespace
#       characters
#   B - The character sequences 'C#' or 'C++'
#   C - Followed by 1 or more whitespace characters or the end of line.

    if (/(?:^|\s+)(C#|C\+\+)(?=\s+|$)/) {
#           ^^^^^  ^^^^^^^^    ^^^^^
#             A        B         C

        print "[$1] [$_]\n";
    } else {
        print "[--] [$_]\n";
    }
}

__END__
This program is written in C++ We'll delete it after ten days
This program is written in !C++ We'll delete it after ten days
This program is written in C++! We'll delete it after ten days
This program is written in C# We'll delete it after ten days
C# is the language this program is written in.
 C# is the language this program is written in.
C++ is the language this program is written in.
This program is written in C#
This program is written in C++
This program is written in C++!

Expected output:

[C++] [This program is written in C++ We'll delete it after ten days]
[--] [This program is written in !C++ We'll delete it after ten days]
[--] [This program is written in C++! We'll delete it after ten days]
[C#] [This program is written in C# We'll delete it after ten days]
[C#] [C# is the language this program is written in.]
[C#] [ C# is the language this program is written in.]
[C++] [C++ is the language this program is written in.]
[C#] [This program is written in C#]
[C++] [This program is written in C++]
[--] [This program is written in C++!]
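
Since the question is about C#, here is a hedged port of the same idea to .NET, using a subset of the sample lines above. The class WhitespaceDelimitedDemo and the helper FindLanguage are illustrative names only. It assumes the default regex options, under which ^ and $ anchor to the start and end of the whole string rather than of each line.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDelimitedDemo
{
    // Same regex as the Perl program; '$' here means end of string (or just
    // before a final newline) because RegexOptions.Multiline is not set.
    private static readonly Regex Token =
        new Regex(@"(?:^|\s+)(C#|C\+\+)(?=\s+|$)");

    // Hypothetical helper: returns the captured language name, or "--" if none.
    public static string FindLanguage(string line)
    {
        Match m = Token.Match(line);
        return m.Success ? m.Groups[1].Value : "--";
    }

    public static void Main()
    {
        string[] lines =
        {
            "This program is written in C++ We'll delete it after ten days",
            "This program is written in !C++ We'll delete it after ten days",
            "This program is written in C++! We'll delete it after ten days",
            "C# is the language this program is written in.",
            "This program is written in C#",
        };

        // Mirrors the "[match] [line]" layout of the Perl output.
        foreach (string line in lines)
        {
            Console.WriteLine("[{0}] [{1}]", FindLanguage(line), line);
        }
    }
}
```

Unlike the \B approach, this version treats punctuation as part of the token, so "C++!" and "!C++" are rejected just as in the Perl output.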