Question

I have this HTML:

</TABLE>
<HR>
<font size="+1"> Method and apparatus for re-sizing and zooming images by operating directly
     on their digital transforms
</font><BR>

I'm trying to capture the text inside the font tag. Here is my regex:

    Regex regex = new Regex("</TABLE><HR><font size=\"+1\">(?<title>.*?)</font><BR>", RegexOptions.Singleline | RegexOptions.IgnoreCase);

    Match match = regex.Match(data);

    string title = match.Groups["title"].Value;

However, I get an empty title. Can anyone tell me what I'm missing?

Answer 1

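Your regex:

new Regex("</TABLE><HR><font size=\"+1\">(?<title>.*?)</font><BR>", RegexOptions.Singleline | RegexOptions.IgnoreCase);

is incorrect because + has a special meaning in regular expressions: it is a quantifier meaning "one or more of the preceding element". As written, "+ asks for one or more double quotes followed by 1, so the pattern never matches size="+1". The match fails, and match.Groups["title"].Value on a failed match is an empty string.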

Given your input string, what you want is to escape it:

new Regex("</TABLE><HR><font size=\"\\+1\">(?<title>.*?)</font><BR>", RegexOptions.Singleline | RegexOptions.IgnoreCase);
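To see the effect of the escape in isolation, here is a minimal sketch. It assumes a hypothetical single-line input (so line breaks play no role yet); the class name EscapePlusDemo and the sample strings are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapePlusDemo
{
    const RegexOptions Options = RegexOptions.Singleline | RegexOptions.IgnoreCase;

    // Unescaped: "+ means "one or more double quotes", then a literal 1.
    public static readonly Regex Unescaped = new Regex(
        "</TABLE><HR><font size=\"+1\">(?<title>.*?)</font><BR>", Options);

    // Escaped: \+ matches a literal plus sign.
    public static readonly Regex Escaped = new Regex(
        "</TABLE><HR><font size=\"\\+1\">(?<title>.*?)</font><BR>", Options);

    public static void Main()
    {
        // Hypothetical inputs, not taken from the question.
        string withPlus = "</TABLE><HR><font size=\"+1\">Some title</font><BR>";
        string withoutPlus = "</TABLE><HR><font size=\"1\">Other title</font><BR>";

        Console.WriteLine(Unescaped.IsMatch(withPlus));    // False
        Console.WriteLine(Unescaped.IsMatch(withoutPlus)); // True: it matches size="1" instead
        Console.WriteLine(Escaped.Match(withPlus).Groups["title"].Value); // Some title

        // Regex.Escape can escape metacharacters in a literal fragment for you.
        Console.WriteLine(Regex.Escape("size=\"+1\""));    // size="\+1"
    }
}
```

If the literal part of a pattern comes from a variable rather than being typed by hand, passing it through Regex.Escape avoids having to remember which characters are special.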

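Also, if the string you want to match contains line breaks, you have to add a wildcard to skip over them (the dot matches newlines only because you are already passing RegexOptions.Singleline), so this is probably closer to what you're trying to do:

new Regex("</TABLE>.*<HR>.*<font size=\"\\+1\">(?<title>.*?)</font>.*<BR>", RegexOptions.Singleline | RegexOptions.IgnoreCase);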
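A greedy .* between the tags can skip over arbitrary content, including other tags, so on a larger page it may latch onto the wrong font element. If only whitespace separates the tags, as in the HTML from the question, \s* is a tighter alternative. The following is a sketch under that assumption; the TitleExtractor class and TryExtractTitle method are hypothetical names, and collapsing the whitespace in the captured title is an extra step not taken from the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TitleExtractor
{
    // \s* allows only whitespace (including line breaks) between the tags.
    static readonly Regex TitleRegex = new Regex(
        "</TABLE>\\s*<HR>\\s*<font size=\"\\+1\">(?<title>.*?)</font>\\s*<BR>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns true and the cleaned-up title on a match; false otherwise.
    public static bool TryExtractTitle(string data, out string title)
    {
        Match match = TitleRegex.Match(data);
        if (!match.Success)
        {
            title = string.Empty;
            return false;
        }

        // The captured text spans two lines; collapse whitespace runs to one space.
        title = Regex.Replace(match.Groups["title"].Value, "\\s+", " ").Trim();
        return true;
    }

    public static void Main()
    {
        // The HTML fragment from the question, written as a C# string.
        string data =
            "</TABLE>\n<HR>\n" +
            "<font size=\"+1\"> Method and apparatus for re-sizing and zooming images by operating directly\n" +
            "     on their digital transforms\n" +
            "</font><BR>\n";

        if (TryExtractTitle(data, out string title))
        {
            // Prints the title on one line, without the leading and trailing whitespace.
            Console.WriteLine(title);
        }
        else
        {
            Console.WriteLine("No title found.");
        }
    }
}
```

Checking match.Success before reading the group makes a failed match visible, instead of silently yielding an empty string.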

Regex named group problem
