Question

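I want to extract some forum posts from a forum page. Each post sits between the opening and closing tags of a blockquote element (as shown in the regular expression). I want to pull out the content with a Regex. This is the code I am using: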

string pattern = @"(<blockquote class=\"postcontent restore \">)(.*?)(</blockquote>)";
Regex test = new Regex(pattern,  RegexOptions.IgnorePatternWhitespace);
MatchCollection m = test.Matches(downloadString);
var arr = m
  .Cast<Match>()
  .Select(n => n.Value)
  .ToArray();

foreach (string match in arr)
{
    Console.WriteLine(match);
}
Console.ReadLine();

I have this example:

<blockquote class="postcontent restore ">
  <br>
    Some Stuff
  <br>
    Some Stuff #2
  <br>
</blockquote>

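The problem is that the returned array is empty. Any idea what might be wrong? I suspect it is because of the whitespace, but I do not know how to "ignore" it.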

Answer 1

The dot (.) matches any character except a newline.

You can use this to include the newlines:

(<blockquote class=\"postcontent restore \">)(\n*.*)(<\/blockquote>)
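To illustrate the point about the dot, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and the sample string and variable names are made up for illustration. It shows that the dot stops at a newline by default, and that RegexOptions.Singleline, offered here as an alternative to adding \n* to the pattern, lets the dot cross line breaks.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input: two lines wrapped in a tag.
string sample = "<b>line one\nline two</b>";

// By default '.' does not match '\n', so the match cannot span the line break.
bool defaultSpans = Regex.IsMatch(sample, "<b>(.*?)</b>");

// RegexOptions.Singleline makes '.' match every character, including '\n'.
Match m = Regex.Match(sample, "<b>(.*?)</b>", RegexOptions.Singleline);

Console.WriteLine(defaultSpans);       // False
Console.WriteLine(m.Success);          // True
Console.WriteLine(m.Groups[1].Value);  // both lines between the tags
```

Despite its name, Singleline does not restrict matching to a single line; it only changes what the dot matches.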

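Your pattern also does not escape the double quotes and the forward slash, so here it is:

Edit: Sorry, the @ is there, so the final version should be as follows :) Edit 2: Full, tested source code. It is your responsibility to check IsMatch or guard against null references.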

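// In a verbatim string, "" is one double quote, so \"" yields \" in the regex.
string pattern = @"(<blockquote class=\""postcontent restore \"">)+((\n*)(.*))+(</blockquote>)";
Regex test = new Regex(pattern);
MatchCollection matches = test.Matches(downloadString);
StringBuilder xmlContentBuilder = new StringBuilder();
// Group 2 is repeated, so its Captures collection holds every repetition.
// matches[0] throws if nothing matched, so check matches.Count first.
foreach (Capture capture in matches[0].Groups[2].Captures)
{
    xmlContentBuilder.Append(capture);
}
Console.WriteLine(xmlContentBuilder);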
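
As a variation on the code above, the following sketch uses RegexOptions.Singleline with a lazy quantifier and handles the case where nothing matches. It assumes C# 9 or later (top-level statements); ExtractPosts is a hypothetical helper name, and the downloadString value is a stand-in for the downloaded page. It deliberately does not pass RegexOptions.IgnorePatternWhitespace: that option makes the engine drop unescaped whitespace from the pattern, so the literal spaces inside the class attribute would no longer be matched.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: returns the trimmed inner text of every matching blockquote.
static List<string> ExtractPosts(string html)
{
    // In a verbatim string, "" stands for one double quote.
    string pattern = @"<blockquote class=""postcontent restore "">(.*?)</blockquote>";
    var results = new List<string>();
    // Singleline lets '.' cross line breaks; the lazy .*? stops at the first closing tag.
    foreach (Match match in Regex.Matches(html, pattern, RegexOptions.Singleline))
    {
        results.Add(match.Groups[1].Value.Trim());
    }
    return results;
}

// Stand-in for the downloaded page content.
string downloadString =
    "<blockquote class=\"postcontent restore \">\n  <br>\n    Some Stuff\n  <br>\n</blockquote>";

List<string> posts = ExtractPosts(downloadString);
if (posts.Count == 0)
{
    Console.WriteLine("No matches found.");
}
foreach (string post in posts)
{
    Console.WriteLine(post);
}
```

Because the quantifier is lazy, each blockquote on the page is returned as its own entry instead of one match running from the first opening tag to the last closing tag.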
