In C#, my regular expression uses the following pattern:
string pattern = "<div class=\"alt\" title=\"[\\w\\s]+\"><strong>([\\w\\s]+)</strong></div>";
I create a Match object like this:
status = Regex.Match(html, pattern);
But when I call .groups() on status, I get blank text, even when there is a match. Am I extracting the group correctly?
Edit: here is some of the HTML:
<tr>
<td>
<div class="alt" title="Released to Manufacturing">
<strong>Released to Manufacturing</strong>
Answer 0 (score: 0)
string strRegex = @"<div class=""alt"" title=""[\w\s]+""><strong>([\w\s]+)</strong></div>";
RegexOptions myRegexOptions = RegexOptions.IgnoreCase | RegexOptions.Multiline;
Regex myRegex = new Regex(strRegex, myRegexOptions);
string strTargetString = @"<div class=""alt"" title=""released""><strong>Released</strong></div>";
foreach (Match myMatch in myRegex.Matches(strTargetString))
{
    if (myMatch.Success)
    {
        // Groups[0] is the whole match; Groups[1] is the first capturing group.
        var value = myMatch.Groups[1].Value;
    }
}
Verified with RegexHero.
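One thing to watch: the pattern above requires `<strong>` to follow the `<div ...>` tag immediately, while the HTML in the question has a line break between them. Here is a minimal sketch that tolerates that whitespace, assuming the closing `</div>` follows on a later line (the question's snippet is cut off before it). The names `StatusExtractor` and `TryExtractStatus` are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StatusExtractor
{
    // \s* between the tags tolerates line breaks and indentation in the markup.
    // The lazy group plus the following \s* keeps trailing whitespace out of the capture.
    private static readonly Regex StatusPattern = new Regex(
        @"<div\s+class=""alt""\s+title=""[\w\s]+"">\s*<strong>\s*([\w\s]+?)\s*</strong>\s*</div>",
        RegexOptions.IgnoreCase);

    // Returns true and sets status to the text inside <strong> when the pattern matches.
    public static bool TryExtractStatus(string html, out string status)
    {
        Match match = StatusPattern.Match(html);
        // Groups[0] is the whole match; Groups[1] is the first capturing group.
        status = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }
}
```

Call it as `StatusExtractor.TryExtractStatus(html, out string status)` and read `status` only when the call returns true.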
Answer 1 (score: 0)
Regular expressions are not meant for parsing HTML.
Use an HTML parser instead, like this:
HtmlDocument doc = new HtmlDocument();
doc.Load(yourStream);
var altElementValues= doc.DocumentNode
.SelectNodes("//div[@class='alt']/strong")
.Select(x=>x.InnerText);
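If you want to try the same XPath query without adding a parser package, here is a rough sketch that uses the built-in System.Xml.Linq classes as a stand-in. This is an XML parser, not an HTML parser, so it only works when the markup happens to be well-formed (every tag closed, attributes quoted); real-world HTML often is not. The names `AltTextReader` and `ReadAltValues` are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class AltTextReader
{
    // Returns the trimmed text of every <strong> that is a direct child of <div class="alt">.
    // Assumes the markup is well-formed XML; XDocument.Parse throws XmlException otherwise.
    public static List<string> ReadAltValues(string markup)
    {
        // Wrap in a single root so a fragment with several top-level elements still parses.
        XDocument doc = XDocument.Parse("<root>" + markup + "</root>");
        return doc.XPathSelectElements("//div[@class='alt']/strong")
                  .Select(e => e.Value.Trim())
                  .ToList();
    }
}
```

Unlike the regex, the XPath query does not care about whitespace or line breaks between the tags, which is the main point of going through a parser.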