Question

The text follows this pattern:

<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)
<tr class="text" (any sequence of characters here, except ABC)ABC(any sequence of characters here, except ABC)

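So basically the line above (which may itself contain line breaks) can repeat many times, and the idea is to retrieve the first 3 characters immediately after ABC.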

I tried regular expressions such as

 \<tr class="text" [.\n]+ABC(?<capture>[.]{3})

but they all failed. Could someone give me a hint?

Answer 1

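By putting the wildcard inside a character class you are effectively escaping it: there, . is a literal period, so [.\n]+ matches only periods and newlines and [.]{3} matches only three periods. Just use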

\<tr class="text" .+?ABC(?<capture>.{3})

Be sure to use RegexOptions.Singleline so that . also matches newlines!
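As a rough sketch of how this pattern might be applied when several rows appear in one string: Regex.Matches returns every match, and the named group capture holds the three characters after ABC. The class name CaptureDemo, the method ExtractAfterAbc, and the sample rows are made up for illustration and are not part of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Returns the three characters following the first "ABC" of each row.
    public static List<string> ExtractAfterAbc(string html)
    {
        // .+? is lazy, so each match stops at the first ABC it can reach.
        // Singleline makes . match newlines as well.
        var regex = new Regex(
            @"<tr class=""text"" .+?ABC(?<capture>.{3})",
            RegexOptions.Singleline);

        var results = new List<string>();
        foreach (Match match in regex.Matches(html))
        {
            results.Add(match.Groups["capture"].Value);
        }
        return results;
    }

    public static void Main()
    {
        // Hypothetical input: two rows, the first spanning two lines.
        string html =
            "<tr class=\"text\" id=\"a\">\nfooABC123bar\n" +
            "<tr class=\"text\" id=\"b\">bazABCxyzqux\n";

        foreach (string s in ExtractAfterAbc(html))
        {
            Console.WriteLine(s); // prints 123, then xyz
        }
    }
}
```

Note that if a row contained no ABC at all, the lazy .+? would keep going into the next row, which is one more reason to prefer a real parser for anything beyond a quick extraction.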

However, you shouldn't actually use regular expressions for this at all. Use a DOM parser instead. I have seen HTML Agility Pack regularly recommended for .NET.

Answer 2

Here is a regular expression that captures the first 3 characters after "ABC" in a string:

".+ABC(...)"

In C#, your match will contain a collection of groups, one of which will be the 3 characters.

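Make sure your string does not contain any unexpected "ABC", because that will throw it off: the greedy .+ runs up to the last "ABC" in the input.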
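To make the warning about a stray "ABC" concrete, here is a small sketch comparing the greedy pattern with a lazy variant on an input that contains "ABC" twice. The helper FirstCapture, the class name GreedyVsLazy, and the input string are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazy
{
    // Returns the text of group 1, or null if the pattern does not match.
    public static string FirstCapture(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        string input = "xxABC111yyABC222";

        // Greedy .+ consumes as much as it can, so it lands on the last ABC.
        Console.WriteLine(FirstCapture(".+ABC(...)", input));   // prints 222

        // Lazy .+? consumes as little as it can, so it stops at the first ABC.
        Console.WriteLine(FirstCapture(".+?ABC(...)", input));  // prints 111
    }
}
```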

This code

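using System;
using System.Text.RegularExpressions;

public static class Program
{
    public static void Main()
    {
        // Greedy .+ runs to the last "ABC"; group 1 holds the next 3 characters.
        Regex regex = new Regex(".+ABC(...)");

        Match match = regex.Match("baln390nABCqlcln");
        // Groups[0] is the whole match, Groups[1] is the captured text.
        foreach (Group group in match.Groups)
        {
            Console.WriteLine(group.Value);
        }
    }
}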

produces this output

baln390nABCqlc
qlc
Press any key to continue . . .

Answer 3

<tr class="text" .+ABC(?<capture>.{3})

Use it in combination with RegexOptions.Singleline (so that . matches newlines).

Which regular expression would capture this sequence?
