Getting the value of a group in a Regex

Time: 2013-10-23 06:32:59

Tags: c# regex regex-group

In C#, my regular expression uses the following pattern:

string pattern = "<div class=\"alt\" title=\"[\\w\\s]+\"><strong>([\\w\\s]+)</strong></div>";

I create a Match object like this:

status = Regex.Match(html, pattern);

But if I call .groups() on status, I get blank text, even though there is a match. Am I extracting the group correctly?

Edit: here is some of the HTML:

          <tr>
            <td>
                    <div class="alt" title="Released to Manufacturing">
                            <strong>Released to Manufacturing</strong>

2 Answers:

Answer 0 (score: 0)

string strRegex = @"<div class=""alt"" title=""[\w\s]+""><strong>([\w\s]+)</strong></div>";
RegexOptions myRegexOptions = RegexOptions.IgnoreCase | RegexOptions.Multiline;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"<div class=""alt"" title=""released""><strong>Released</strong></div>";

foreach (Match myMatch in myRegex.Matches(strTargetString))
{
    if (myMatch.Success)
    {
        var value = myMatch.Groups[1].Value;
    }
}

Verified with RegexHero.
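
One thing the test string above does not exercise is the newline and indentation between the div and strong tags in the HTML from the question. The following is a minimal sketch of how the capture could be read in that case, not a definitive fix. The class and method names (StatusExtractor, ExtractStatus) are hypothetical, the pattern is an adaptation of the one in the question, and the closing </div> is left out because the posted HTML is cut off before it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StatusExtractor
{
    // Hypothetical helper: returns the text inside <strong>, or null when nothing matches.
    public static string ExtractStatus(string html)
    {
        // \s* between the tags tolerates the newline and indentation seen in the question's HTML.
        const string pattern =
            @"<div\s+class=""alt""\s+title=""[\w\s]+"">\s*<strong>([\w\s]+)</strong>";

        Match match = Regex.Match(html, pattern, RegexOptions.IgnoreCase);

        // Groups[0] is the whole match; Groups[1] is the first parenthesized group.
        return match.Success ? match.Groups[1].Value.Trim() : null;
    }
}
```

Called as StatusExtractor.ExtractStatus(html) on the snippet from the question, this should return the text between the strong tags. Note that Groups is an indexed property on Match, so the capture is read as Groups[1].Value rather than through a method call.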

Answer 1 (score: 0)

Regular expressions are not meant for parsing HTML.

Use an HTML parser such as HtmlAgilityPack:

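   // Requires the HtmlAgilityPack package, plus "using HtmlAgilityPack;" and "using System.Linq;".
   HtmlDocument doc = new HtmlDocument();
   doc.Load(yourStream);
   // Select every <strong> directly inside <div class="alt"> and read its text.
   // Note: SelectNodes returns null when no node matches the XPath.
   var altElementValues = doc.DocumentNode
                             .SelectNodes("//div[@class='alt']/strong")
                             .Select(x => x.InnerText);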