How to add all links inside a specific div class from a web page in VB.NET

Time: 2015-06-11 08:38:24

Tags: html vb.net facebook html-agility-pack

I want to add these items to a list box:

https://www.facebook.com/XXXXXXX

https://www.facebook.com/XXXXXXX

The file:

<div class="fsl fwb fcb">
 <a href="https://www.facebook.com/XXXXXXX?fref=pb&hc_location=friends_tab"
<div class="fsl fwb fcb">
 <a href="https://www.facebook.com/XXXXXXX?fref=pb&hc_location=friends_tab"
<div class="fsl fwb fcb">
 <a href="https://www.facebook.com/XXXXXXX?fref=pb&hc_location=friends_tab"

1 Answer:

Answer 0 (score: 1)

This should work as expected:

' "Path" is a placeholder for the location of the saved HTML file
Dim html = System.IO.File.ReadAllText("Path")
Dim doc = New HtmlAgilityPack.HtmlDocument()
doc.LoadHtml(html)

Dim anchorTexts As New List(Of String)
Dim divNodes = doc.DocumentNode.SelectNodes("//div[@class='fsl fwb fcb']")
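If divNodes IsNot Nothing Then
    For Each div In divNodes
        ' ".//a" searches only inside the current div;
        ' "//a" would match every anchor in the whole document
        Dim anchorNodes = div.SelectNodes(".//a")
        ' SelectNodes returns Nothing when nothing matches
        If anchorNodes Is Nothing Then Continue For
        For Each anchorNode In anchorNodes
            Dim href As String = anchorNode.GetAttributeValue("href", "")
            If Not String.IsNullOrEmpty(href) Then
                anchorTexts.Add(href)
            End If
        Next
    Next
End If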
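
The question asks for list box entries of the form https://www.facebook.com/XXXXXXX, while the collected href values still carry the ?fref=... query string. A minimal sketch of one way to trim them, assuming every href is an absolute URL; the module name LinkCleanup and the function StripQueryStrings are hypothetical names introduced here for illustration, not part of the original answer:

```vb
Imports System
Imports System.Collections.Generic

Module LinkCleanup
    ' Hypothetical helper: returns each absolute URL without its
    ' query string or fragment. Values that do not parse as
    ' absolute URLs are skipped.
    Function StripQueryStrings(hrefs As IEnumerable(Of String)) As List(Of String)
        Dim cleaned As New List(Of String)
        For Each href In hrefs
            Dim parsed As Uri = Nothing
            If Uri.TryCreate(href, UriKind.Absolute, parsed) Then
                ' GetLeftPart(UriPartial.Path) keeps scheme, host and path only
                cleaned.Add(parsed.GetLeftPart(UriPartial.Path))
            End If
        Next
        Return cleaned
    End Function

    Sub Main()
        ' Sample value copied from the HTML in the question
        Dim sample As New List(Of String) From {
            "https://www.facebook.com/XXXXXXX?fref=pb&hc_location=friends_tab"
        }
        For Each url In StripQueryStrings(sample)
            Console.WriteLine(url)
        Next
    End Sub
End Module
```

In a Windows Forms project, the result could then be placed in the list box with something like ListBox1.Items.AddRange(StripQueryStrings(anchorTexts).ToArray()), where ListBox1 is an assumed control name.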