Question

I'm working with regular expressions, and I wrote this:

static void Main(string[] args)
{
    string test = "this a string meant to test long space recognition      n       a";
    Regex regex = new Regex(@"[a-z][\s]{4,}[a-z]$");
    MatchCollection matches = regex.Matches(test);
    if (matches.Count > 1)
        Console.WriteLine("yes");
    else
    {
        Console.WriteLine("no");
        Console.WriteLine("the number of matches is " + matches.Count);
    }
}

As I see it, the Matches method should find both "n ... n" and "n ... a". However, it only manages to find "n ... n", and I just don't understand why that is...

Answer 1

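The $ in your regular expression means that the pattern must occur at the end of the input (or, with RegexOptions.Multiline, at the end of a line). If you want to find all long runs of whitespace, this simple expression is enough: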

\s{4,}

If you really need to know whether the whitespace is surrounded by a-z, you can search like this:

(?<=[a-z])\s{4,}(?=[a-z])

This uses the pattern...

(?<=prefix)find(?=suffix)

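...and finds a position enclosed between a prefix and a suffix. The prefix and suffix are not part of the match, i.e. match.Value contains only the consecutive whitespace. Therefore you will not have the "n" consumption problem mentioned by Jon Skeet.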
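As a minimal sketch of the lookaround approach, the following program counts the whitespace runs in a small test string. The class name LookaroundDemo, the helper GapLengths, and the sample text are assumptions made for illustration; they do not come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Returns the length of each run of 4+ whitespace characters
    // that has a lowercase letter directly before and after it.
    public static int[] GapLengths(string text)
    {
        var regex = new Regex(@"(?<=[a-z])\s{4,}(?=[a-z])");
        MatchCollection matches = regex.Matches(text);
        var lengths = new int[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            lengths[i] = matches[i].Length;
        }
        return lengths;
    }

    public static void Main()
    {
        // Hypothetical input: "x", 5 spaces, "n", 6 spaces, "a".
        string text = "x" + new string(' ', 5) + "n" + new string(' ', 6) + "a";
        foreach (int length in GapLengths(text))
        {
            Console.WriteLine(length); // prints 5, then 6
        }
    }
}
```

Because the letters are only asserted and never consumed, the middle "n" can serve as the suffix of the first gap and the prefix of the second.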

Answer 2

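You have two problems:

1) You have anchored the match to the end of the string. In fact, the value that is matched is "n ... a", not "n ... n".

2) The middle "n" is consumed by the first match, so it cannot be part of the second match. If you change that "n" to "nx" (and remove the $), you will see "n ... n" and "x ... a".

Short but complete example: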

using System;
using System.Text.RegularExpressions;

public class Test
{
    static void Main(string[] args)
    {
        string text = "ignored a      bc       d";
        Regex regex = new Regex(@"[a-z][\s]{4,}[a-z]");
        foreach (Match match in regex.Matches(text))
        {
            Console.WriteLine(match);
        }
    }
}

Result:

a      b
c       d

Answer 3

I just do not understand why is that..

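I think the "why" of consumption by the first match is to prevent a regex like "\\w+s", designed to get every word that ends in an "s", from returning "ts", "ats" and "cats" when matched against "cats".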
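A small sketch of that behaviour, assuming the input is just the word "cats" (the class name ConsumptionDemo and the helper WordsEndingInS are made up for this illustration):

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConsumptionDemo
{
    // Collects every match of "\\w+s" in the given text.
    public static string[] WordsEndingInS(string text)
    {
        MatchCollection matches = Regex.Matches(text, "\\w+s");
        var values = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            values[i] = matches[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        foreach (string word in WordsEndingInS("cats"))
        {
            Console.WriteLine(word); // prints only: cats
        }
    }
}
```

The characters of "cats" are consumed by the single match, so the shorter candidates "ats" and "ts" are never reported.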

Answer 4

The regex engine finds one match at a time; if you want more, you have to restart the search yourself after the first match.
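Note that Regex.Matches already walks through all non-overlapping matches, so a manual restart only matters when overlapping matches are wanted. One possible way to do that, sketched here under the assumption that the $ anchor has been removed, is to call Regex.Match(string, int) again one character past the start of the previous match. The names OverlapDemo and FindOverlapping and the sample text are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OverlapDemo
{
    // Finds matches that may share a letter, by restarting the search manually.
    public static List<string> FindOverlapping(string text)
    {
        var regex = new Regex(@"[a-z]\s{4,}[a-z]");
        var results = new List<string>();
        Match match = regex.Match(text);
        while (match.Success)
        {
            results.Add(match.Value);
            // Restart just after the first character of this match, so its
            // trailing letter is still available to begin the next match.
            match = regex.Match(text, match.Index + 1);
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical input: "x", 5 spaces, "n", 6 spaces, "a".
        string text = "x" + new string(' ', 5) + "n" + new string(' ', 6) + "a";
        foreach (string value in FindOverlapping(text))
        {
            Console.WriteLine("[" + value + "]");
        }
    }
}
```

For this input the loop reports two matches, one ending in "n" and one starting with the same "n", whereas a plain regex.Matches(text) call would report only the first.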

Regex instance doesn't find more than one match, even though it exists
