I am using the HTML Agility Pack to search for a div with class="fileHeader" whose child h4 element contains the text "RelayClinical Patient Education with Animations Install zip". Once it is found, I want to capture the href attribute of the anchor tag inside that particular block. How can I get it?

HTML source

<div class="fileHeader" id="fileHeader_7311111">
    <h4 class="collapsed">RelayClinical Patient Education with Animations Install zip</h4>
    <div class="defaultMethod">
        <a class="buttonGrey" href="https://mckc-esd.subscribenet.com/cgi-bin/download?rid=2511740931&rp=DTM20130905162949MzcyODIwNjM0" title="Clicking this link will open a new window." rel="noreferrer">
            HTTPS Download
        </a>
    </div>
</div>
Code

HtmlNodeCollection fileHeaderNodes = bodyNode.SelectNodes("//div[@class='fileHeader']//h4");
foreach (HtmlNode fileHeader in fileHeaderNodes)
{
    if (fileHeader.InnerText.Trim() == "RelayClinical Patient Education with Animations Install zip")
    {
        // select all a tags (html anchor tags) that have a href attribute
        foreach (HtmlNode link in fileHeader.SelectNodes("//a[@href]"))
        {
            if ((link.InnerText.Trim() == "HTTPS Download") && isFound == true)
            {
                count++;
                // extract the link and put in dataUrl var
                HtmlAttribute att = link.Attributes["href"];
                dataUrl = att.Value;
            }
        }
    }
}
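Answer 0 (score: 0)

Instead of selecting the h4 elements, select the a element directly; you can then read its href attribute. In the XPath, the predicate h4='{0}' restricts the match to the fileHeader div whose h4 text equals the title you are looking for.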
var h4Text = "RelayClinical Patient Education with Animations Install zip";
var xpath = String.Format(
    "//div[@class='fileHeader' and h4='{0}']/div[@class='defaultMethod']/a",
    h4Text
);
var anchor = doc.DocumentNode.SelectSingleNode(xpath);
if (anchor != null)
{
    var attr = anchor.GetAttributeValue("href", null);
    // do stuff with attr
}
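
For comparison, the loop from the question can probably be kept if the inner query is made relative. An XPath that starts with `//` is evaluated from the document root even when it is called on a child node, so `fileHeader.SelectNodes("//a[@href]")` returns every anchor on the page; in addition, the link is not a descendant of the h4 but of a sibling div. The following is only a minimal sketch under those assumptions: `DownloadLinkFinder` and `FindDownloadUrl` are hypothetical names, not part of the HTML Agility Pack, and the markup is assumed to match the sample in the question.

```csharp
using System;
using HtmlAgilityPack;

public static class DownloadLinkFinder
{
    // Hypothetical helper: returns the href of the "HTTPS Download" link inside
    // the fileHeader block whose h4 text equals h4Text, or null if none matches.
    public static string FindDownloadUrl(HtmlDocument doc, string h4Text)
    {
        var headers = doc.DocumentNode.SelectNodes("//div[@class='fileHeader']/h4");
        if (headers == null)
        {
            return null; // SelectNodes returns null when nothing matches
        }

        foreach (HtmlNode h4 in headers)
        {
            if (h4.InnerText.Trim() != h4Text)
            {
                continue;
            }

            // The leading "." keeps the search inside this block's parent div
            // instead of restarting from the document root.
            var links = h4.ParentNode.SelectNodes(".//a[@href]");
            if (links == null)
            {
                continue;
            }

            foreach (HtmlNode link in links)
            {
                if (link.InnerText.Trim() == "HTTPS Download")
                {
                    return link.GetAttributeValue("href", null);
                }
            }
        }

        return null;
    }
}
```

It could be called as `DownloadLinkFinder.FindDownloadUrl(doc, h4Text)` after loading the page into an `HtmlDocument`. The single-XPath version above is shorter; this variant may be easier to adapt if the title needs trimming or a case-insensitive comparison before matching.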