Question

Here is the HTML source I am trying to parse:

<a style='white-space: nowrap;' href='/AuthorStories-4931/dreamfall.htm'><img class='donoricon' alt='(Current Donor)'  title='(Current Donor)' src='http://static.tthf.me/images/donors/Current%20Donor.gif'/>dreamfall</a>

Here is the code I am using:

authorLink = doc.DocumentNode.SelectSingleNode("//a[contains(@href, 'AuthorStories')]").OuterHtml;

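This grabs the link correctly, but it also captures the img element. The only part I want is the href value. Any suggestions on how to extract just that part?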

Answer 1

[I haven't touched HtmlAgilityPack in a few years, but this should be right]

Instead of using OuterHtml, look at the node returned by SelectSingleNode: it should have an Attributes collection, and you should be able to get the href from there.
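
A minimal sketch of that suggestion, assuming the HtmlAgilityPack NuGet package is referenced and the markup from the question is loaded from a string. The `Program` class and the `html` variable are only for illustration and are not from the original post. It reads the attribute with `GetAttributeValue`, which falls back to a default when the attribute is missing; `link.Attributes["href"].Value` should work as well when the attribute is known to exist.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // The anchor element from the question, inlined here for the example.
        const string html =
            "<a style='white-space: nowrap;' href='/AuthorStories-4931/dreamfall.htm'>" +
            "<img class='donoricon' alt='(Current Donor)' title='(Current Donor)' " +
            "src='http://static.tthf.me/images/donors/Current%20Donor.gif'/>dreamfall</a>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Same XPath as in the question; SelectSingleNode returns null when nothing matches.
        HtmlNode link = doc.DocumentNode.SelectSingleNode("//a[contains(@href, 'AuthorStories')]");
        if (link != null)
        {
            // Read only the href attribute instead of the whole element's OuterHtml.
            string authorLink = link.GetAttributeValue("href", string.Empty);
            Console.WriteLine(authorLink); // prints /AuthorStories-4931/dreamfall.htm
        }
    }
}
```

The null check matters because SelectSingleNode returns null rather than throwing when no matching anchor is found.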
