<div class="outer">
<div class="divOne"></div>
<div class="divContent">
<h3>SomeTitle</h3>
<h4>SomeSubtitle</h4>
<ul>
<li><a href="/someUrlx.htm">SomeUrl</a>
<span> Nr of records under this url </span>
</li>
</ul>
<h4>Some Other Subtitle</h4>
<ul>
<li><a href="/someUrlx.htm">SomeUrl</a>
<span> Nr of records under this url </span>
</li>
</ul>
</div>
</div>
Once again, I want to get all the unordered list items under the HTML structure above.
I can get the content of the divContent class with:
var regs = htmlDoc.DocumentNode.SelectSingleNode(@"//div[@class='outer']");
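var descendant = regs.Descendants()
    // GetAttributeValue falls back to "" when a div has no class attribute,
    // which avoids a NullReferenceException on x.Attributes["class"].
    .Where(x => x.Name == "div" && x.GetAttributeValue("class", "") == "divContent")
    .Select(x => x.OuterHtml);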
Now I need an expression that gets the ul li items.
Answer 0 (score: 3)
This should work:
IEnumerable<string> listItemHtml = htmlDoc.DocumentNode.SelectNodes(
@"//div[@class='outer']/div[@class='divContent']/ul/li")
.Select(li => li.OuterHtml);
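Example: https://dotnetfiddle.net/fnDPLB
Update based on the comments below:
If you only want the <li> elements that belong to the <ul> element directly following the <h4> element whose value is "SomeSubtitle", this XPath expression should work: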
//div[@class='outer'] // Get div.outer
/div[@class='divContent'] // under that div, find div.divContent
/h4[text()='SomeSubtitle'] // under div.divContent, find an h4 with the value 'SomeSubtitle'
/following::ul[1]/li // Get the first ul following the h4 and then get its li elements.
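
To make the annotated expression above easier to try, here is a minimal sketch that joins the four steps into one string and runs it with HtmlAgilityPack. The inline HTML is a trimmed copy of the markup from the question, and the `Program` class and `xpath` variable are names chosen only for this illustration. It assumes the HtmlAgilityPack NuGet package is referenced.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

class Program
{
    static void Main()
    {
        // Trimmed copy of the markup from the question.
        const string html = @"
<div class=""outer"">
  <div class=""divOne""></div>
  <div class=""divContent"">
    <h3>SomeTitle</h3>
    <h4>SomeSubtitle</h4>
    <ul><li><a href=""/someUrlx.htm"">SomeUrl</a><span> Nr of records under this url </span></li></ul>
    <h4>Some Other Subtitle</h4>
    <ul><li><a href=""/someUrlx.htm"">SomeUrl</a><span> Nr of records under this url </span></li></ul>
  </div>
</div>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // The four annotated steps joined into one expression (comments removed).
        const string xpath =
            "//div[@class='outer']/div[@class='divContent']" +
            "/h4[text()='SomeSubtitle']/following::ul[1]/li";

        // SelectNodes returns null when nothing matches, so guard before using LINQ.
        var nodes = htmlDoc.DocumentNode.SelectNodes(xpath);
        var listItemHtml = nodes == null
            ? Enumerable.Empty<string>()
            : nodes.Select(li => li.OuterHtml);

        foreach (var item in listItemHtml)
            Console.WriteLine(item);
    }
}
```

If the <ul> must be a sibling of the <h4> rather than just the next <ul> in document order, `following-sibling::ul[1]` is a narrower alternative to `following::ul[1]`.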