VB.NET HtmlAgilityPack SelectNodes loop

Time: 2016-03-11 17:16:45

Tags: arrays vb.net loops html-agility-pack

I built an array of links:

    Dim horof As String = "A B C D"
    Dim alphabarray As String() = horof.Split(New Char() {" "c})
    Dim urls() As String = alphabarray.Select(Function(o) "http://somelink/list-" & o).ToArray()

The output looks like this:

http://somelink/list-A
http://somelink/list-B
http://somelink/list-C
http://somelink/list-D

Then I make a web request for each link, like this:

    For i As Int32 = 0 To urls.Length - 1
        Dim wRequest As WebRequest
        Dim WResponse As WebResponse
        wRequest = FtpWebRequest.Create(urls(i))
        WResponse = wRequest.GetResponse
        Dim SR As StreamReader
        SR = New StreamReader(WResponse.GetResponseStream)
        urls(i) = SR.ReadToEnd
    Next

Now the urls string array holds the HTML source of every link. I want to use HtmlAgilityPack to run SelectNodes on each HTML source in the array:

Dim htmlDoc As New HtmlDocument()
htmlDoc.LoadHtml(urls) 
Dim wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath")

But it did not work.

I also tried doing it inside the same loop:

    Dim htmlDoc As New HtmlDocument()
    Dim wantednode As HtmlNodeCollection
    For i As Int32 = 0 To urls.Length - 1
        Dim wRequest As WebRequest
        Dim WResponse As WebResponse
        wRequest = FtpWebRequest.Create(urls(i))
        WResponse = wRequest.GetResponse
        Dim SR As StreamReader
        SR = New StreamReader(WResponse.GetResponseStream)
        urls(i) = SR.ReadToEnd
        htmlDoc.Load(urls(i))
        wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath")
    Next

That did not work either. How can I run wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath") in a loop, once for each HTML source in the urls array?

Each HTML source in the urls array looks like this:

    <body>
      <div class="list_body">
        <ul class="listing">
          <li>
            <a href="http://wanted1.com" title="">title1 </a>
          </li>
          <li>
            <a href="http://wanted2.com" title="">title2  </a>
          </li>
          <li>
            <a href="http://wanted3.com" title="">title3  </a>
          </li>
          <li>
            <a href="http://wanted4.com" title="">title4   </a>
          </li>
          <li>
            <a href="http://wanted5.com" title="">title5  </a>
          </li>
          <li>
            <a href="http://wanted6.com" title="">title6   </a>
          </li>
        </ul>
      </div>
    </body>

I want to extract the links (for example http://wanted2.com) from each string in urls.

2 answers:

Answer 0 (score: 1)

Here is some library code that I use:

Public Function Web_Request_Response(URL As String) As String
    Try
        Dim myRequest As HttpWebRequest
        Dim myResponse As HttpWebResponse
        Dim sr As StreamReader
        Dim sResponse As String = ""
        myRequest = CType(WebRequest.Create(URL), HttpWebRequest)
        myResponse = CType(myRequest.GetResponse(), HttpWebResponse)
        sr = New StreamReader(myResponse.GetResponseStream())
        sResponse = sr.ReadToEnd.ToString
        Return sResponse
    Catch ex As Exception
        LogMsgBox(ex, ex.Message, , "WebRequest_Responce Error")
        Return ""
    End Try
End Function

It differs slightly from your call. Maybe the problem has to do with the type casting? FTP vs. HTTP?

Nice use of LINQ.

You can save some typing:
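On the FTP vs. HTTP point, a small sketch may help. FtpWebRequest.Create is the shared WebRequest.Create method inherited from the base class, so the concrete request type is picked from the URI scheme rather than from the class name written in front of the call. The URL below is the placeholder from the question, and the module name is made up for illustration; no network call is made because GetResponse is never invoked.

```vbnet
Imports System
Imports System.Net

Module RequestTypeSketch
    Sub Main()
        ' Same call as in the question: resolves to the inherited WebRequest.Create
        Dim viaFtpClass As WebRequest = FtpWebRequest.Create("http://somelink/list-A")
        ' The equivalent call written against the base class
        Dim viaBaseClass As WebRequest = WebRequest.Create("http://somelink/list-A")

        ' Both should be HttpWebRequest instances, because the scheme is http
        Console.WriteLine(TypeOf viaFtpClass Is HttpWebRequest)
        Console.WriteLine(TypeOf viaBaseClass Is HttpWebRequest)
    End Sub
End Module
```

If that holds on your framework version, the request type is probably not what breaks the question's code, although writing WebRequest.Create directly is less misleading.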

Dim alphabarray As String() = horof.Split(New Char() {" "c})
--- same as ---
Dim alphabarray As String() = horof.Split({" "c})
 or
Dim alphabarray As String() = horof.Split(" ")

Answer 1 (score: 1)

You should use HtmlDocument.LoadHtml() instead of HtmlDocument.Load(), because you are populating the HtmlDocument from an HTML string:

'urls(i) value has been replaced with HTML string by the following line..
urls(i) = SR.ReadToEnd
'..so next, you need to use `LoadHtml()`
htmlDoc.LoadHtml(urls(i))

wantednode = htmlDoc.DocumentNode.SelectNodes("Xpath")
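Putting this together for the sample markup in the question, a minimal sketch could look like the following. It assumes the HtmlAgilityPack NuGet package is referenced. The ExtractLinks helper, the module name, and the XPath expression are illustrative choices, not part of the original code, and the pages array stands in for the downloaded HTML so the example runs without network access.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module LinkExtractionSketch
    ' Hypothetical helper: returns the href of every <a> inside <ul class="listing">
    Function ExtractLinks(html As String) As List(Of String)
        Dim links As New List(Of String)()
        Dim doc As New HtmlDocument()
        ' LoadHtml parses an HTML string; Load expects a file path, stream, or reader
        doc.LoadHtml(html)

        ' SelectNodes returns Nothing when the XPath matches no nodes
        Dim anchors As HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//ul[@class='listing']/li/a[@href]")
        If anchors Is Nothing Then Return links

        For Each a As HtmlNode In anchors
            links.Add(a.GetAttributeValue("href", ""))
        Next
        Return links
    End Function

    Sub Main()
        ' Stand-in for the HTML strings stored in urls after the download loop
        Dim pages As String() = {
            "<body><div class=""list_body""><ul class=""listing"">" &
            "<li><a href=""http://wanted1.com"" title="""">title1 </a></li>" &
            "<li><a href=""http://wanted2.com"" title="""">title2 </a></li>" &
            "</ul></div></body>"
        }

        For Each page As String In pages
            For Each link As String In ExtractLinks(page)
                Console.WriteLine(link)
            Next
        Next
    End Sub
End Module
```

Inside the question's loop, the same helper could be called as ExtractLinks(SR.ReadToEnd) for each URL. Creating a fresh HtmlDocument per page keeps results from one page separate from the next, and the Nothing check avoids a NullReferenceException on pages where the list is missing.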