Reading lines from GetResponseStream

Time: 2013-02-01 02:16:59

Tags: vb.net httpwebrequest web-scraping

I am fetching a web page with the following code:

Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
request.CookieContainer = logincookie
Dim response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
Dim reader As New StreamReader(response.GetResponseStream())
Dim thesource As String = reader.ReadToEnd

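I am trying to extract data from the source. thesource holds the page's source code correctly, but I cannot iterate over it line by line with a For Each loop, because the loop yields one character at a time rather than one line.

Can anyone suggest how to solve this? Thanks, Stan.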

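1 Answer:

Answer 0: (Score: 0)

I see three options.

Option 1: Read line by line in a While loop instead of a For Each. Use this approach if everything lives in the same function.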

Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
request.CookieContainer = logincookie
Dim response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
Dim reader As New StreamReader(response.GetResponseStream())
Dim line As String = reader.ReadLine()

While line IsNot Nothing

    'Contents of your For Each loop go here

    line = reader.ReadLine()
End While
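
The snippet above never closes the response or the reader. As a minimal sketch of the same While loop with both wrapped in Using blocks, something like the following should work. ProcessLines, its cookies parameter, and the Console.WriteLine body are hypothetical stand-ins for your own code, and Main makes a real request to example.com.

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module DisposeSketch
    ' Hypothetical helper: same loop as Option 1, but the response and
    ' reader are disposed when the Using blocks exit, even on an exception.
    Sub ProcessLines(ByVal url As String, ByVal cookies As CookieContainer)
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.CookieContainer = cookies

        Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(response.GetResponseStream())
                Dim line As String = reader.ReadLine()

                While line IsNot Nothing
                    ' Placeholder for the body of your For Each loop.
                    Console.WriteLine(line)
                    line = reader.ReadLine()
                End While
            End Using
        End Using
    End Sub

    Sub Main()
        ' An empty CookieContainer stands in for logincookie here.
        ProcessLines("http://example.com", New CookieContainer())
    End Sub
End Module
```

The same Using wrapping can be applied inside the functions in the other two options.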

Option 2: You have at least Visual Studio 2012 and want to return the results from this function, so that the calling code can use a For Each loop.

Public Iterator Function GetUrl(ByVal url As String) As IEnumerable(Of String)

    Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
    request.CookieContainer = logincookie
    Dim response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
    Dim reader As New StreamReader(response.GetResponseStream())
    Dim line As String = reader.ReadLine()

    While line IsNot Nothing

        Yield line
        line = reader.ReadLine()
    End While
End Function

Call it like this:

For Each line As String In GetUrl("http://example.com")
    '...
Next line

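Option 3: You want to hand the lines back to the caller, as in the previous option, but cannot use the new iterator language feature. Here the caller passes in a delegate that is invoked once per line.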

Public Sub GetUrl(ByVal url As String, ByVal lineAction As Action(Of String))

    Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
    request.CookieContainer = logincookie
    Dim response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
    Dim reader As New StreamReader(response.GetResponseStream())
    Dim line As String = reader.ReadLine()

    While line IsNot Nothing
        lineAction(line)
        line = reader.ReadLine()
    End While
End Sub

Call it like this:

GetUrl("http://example.com", _
   Sub(l)
       'Do stuff here
   End Sub)
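
Separately from the three options above: if the whole page is already in a string such as thesource, it can also be split into lines after the fact. This is only a sketch, with a hard-coded sample string standing in for the downloaded page. It also shows why the original For Each produced characters: a String enumerates as a sequence of Char.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitSketch
    Sub Main()
        ' Sample text standing in for the result of reader.ReadToEnd.
        Dim thesource As String = "<html>" & vbCrLf & "<body>" & vbLf & "</html>"

        ' For Each over thesource itself would yield one Char at a time.
        ' Splitting on both CRLF and LF gives an array of lines instead.
        Dim lines As String() = thesource.Split(New String() {vbCrLf, vbLf}, StringSplitOptions.None)

        For Each line As String In lines
            Console.WriteLine(line)
        Next
    End Sub
End Module
```

Unlike the ReadLine approaches, this keeps the entire page in memory twice (the string plus the array), so it is best suited to small pages.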