How do I get a web session from a cookie?

Asked: 2011-11-17 01:19:57

Tags: .net vb.net session cookies httpwebrequest

I'm trying to scrape a web page, but in order to POST data I need a web session ID, such as

web_session = HQJ3G1GPAAHRZGFR

How do I get that ID?

Here is my code so far:

Private Sub test()

    Dim postData As String = "web_session=HQJ3G1GPAAHRZGFR&intext=O&term_code=201210&search_type=A&keyword=&kw_scope=all&kw_opt=all&subj_code=BIO&crse_numb=205&campus=*&instructor=*&instr_session=*&attr_type=*&mon=on&tue=on&wed=on&thu=on&fri=on&sat=on&sun=on&avail_flag=on" '/BANPROD/pkgyc_yccsweb.P_Results 
    Dim tempCookie As New CookieContainer
    Dim encoding As New UTF8Encoding
    Dim byteData As Byte() = encoding.GetBytes(postData)

    System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
    Try

        tempCookie.GetCookies(New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"))
        'postData="web_session=" & tempCookie.

        Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results"), HttpWebRequest)
        postReq.Method = "POST"
        postReq.KeepAlive = True
        postReq.CookieContainer = tempCookie
        postReq.ContentType = "application/x-www-form-urlencoded"


        postReq.UserAgent = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.0.3705; Media Center PC 4.0; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)"
        postReq.ContentLength = byteData.Length
        Dim postreqstream As Stream = postReq.GetRequestStream
        postreqstream.Write(byteData, 0, byteData.Length)
        postreqstream.Close()
        Dim postresponse As HttpWebResponse
        postresponse = DirectCast(postReq.GetResponse, HttpWebResponse)
        tempCookie.Add(postresponse.Cookies)

        Dim postresreader As New StreamReader(postresponse.GetResponseStream)
        Dim thepage As String = postresreader.ReadToEnd
        MsgBox(thepage)
    Catch ex As WebException
        MsgBox(ex.Status.ToString & vbNewLine & ex.Message.ToString)
    End Try

End Sub

1 Answer:

Answer 0: (score: 2)

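The problem is that tempCookie.GetCookies() doesn't do what you think it does. It doesn't fetch anything from the server: it only filters the cookies already stored in the CookieContainer down to those that apply to the supplied URL, and since your container is still empty at that point, it returns nothing. What you need to do instead is first request a page that will give you the session token, and then make the actual request for your data. So first request the page at P_Search, then reuse the CookieContainer that was bound to that request and POST to P_Results.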
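To make the GetCookies() behaviour concrete, here is a minimal console sketch that needs no network access. It assumes the session ID is delivered as a cookie named web_session scoped to taylor.yc.edu; that name and scope are guesses based on the question, not something confirmed by the server, and the module name is made up for illustration.

```vbnet
Option Strict On
Option Explicit On

Imports System
Imports System.Net

Module GetCookiesDemo
    Sub Main()
        Dim container As New CookieContainer()
        Dim target As New Uri("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results")

        ' A fresh container holds nothing, so the filtered view is empty: prints 0.
        Console.WriteLine(container.GetCookies(target).Count)

        ' Simulate what a server response would normally store in the container.
        ' ASSUMPTION: the cookie name "web_session" and its domain are illustrative only.
        container.Add(New Cookie("web_session", "HQJ3G1GPAAHRZGFR", "/", "taylor.yc.edu"))

        ' Now GetCookies returns the stored cookies that apply to this URL.
        Dim matching As CookieCollection = container.GetCookies(target)
        Console.WriteLine(matching("web_session").Value)
    End Sub
End Module
```

In other words, GetCookies() is only useful after some earlier response has put cookies into the container.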

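Instead of the HttpWebRequest object, let me point you to the WebClient class and my post here about extending it to support cookies. You'll find that you can simplify your code a lot. Below is a complete VB2010 WinForms app that shows this. If you still want to use the HttpWebRequest object, this should at least give you an idea of what needs to be done: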

Option Strict On
Option Explicit On

Imports System.Net

Public Class Form1

    Private Sub Form1_Load(sender As System.Object, e As System.EventArgs) Handles MyBase.Load
        ''//Create our webclient
        Using WC As New CookieAwareWebClient()
            ''//Set SSLv3
            System.Net.ServicePointManager.SecurityProtocol = Net.SecurityProtocolType.Ssl3
            ''//Create a session, ignore what is returned
            WC.DownloadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Search")
            ''//POST our actual data and get the results
            Dim S = WC.UploadString("https://taylor.yc.edu/BANPROD/pkgyc_yccsweb.P_Results", "POST", "term_code=201130&search_type=K&keyword=math")
            Trace.WriteLine(S)
        End Using
    End Sub
End Class

Public Class CookieAwareWebClient
    Inherits WebClient

    Private cc As New CookieContainer()
    Private lastPage As String

    Protected Overrides Function GetWebRequest(ByVal address As System.Uri) As System.Net.WebRequest
        Dim R = MyBase.GetWebRequest(address)
        If TypeOf R Is HttpWebRequest Then
            With DirectCast(R, HttpWebRequest)
                .CookieContainer = cc
                If Not lastPage Is Nothing Then
                    .Referer = lastPage
                End If
            End With
        End If
        lastPage = address.ToString()
        Return R
    End Function
End Class
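
If you would rather stay with HttpWebRequest, the same two-step flow might look roughly like the sketch below. It is not tested against the live site. It assumes the server issues its session through Set-Cookie on the P_Search response, reuses the form fields from the example above, and leaves the SecurityProtocol setting out, so add it back if the server requires it. The module name and the BaseUrl constant are made up for illustration.

```vbnet
Option Strict On
Option Explicit On

Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Module TwoStepRequestSketch
    ' Hypothetical helper constant: the common URL prefix taken from the question.
    Private Const BaseUrl As String = "https://taylor.yc.edu/BANPROD/pkgyc_yccsweb."

    Sub Main()
        ' One container shared by both requests carries the session across them.
        Dim cookies As New CookieContainer()

        ' Step 1: GET the search page so the server can issue its cookies.
        Dim getReq As HttpWebRequest = DirectCast(WebRequest.Create(BaseUrl & "P_Search"), HttpWebRequest)
        getReq.CookieContainer = cookies
        Using getResp As WebResponse = getReq.GetResponse()
            ' The body is ignored; any Set-Cookie headers are now in the container.
        End Using

        ' Optional: list whatever the server stored for this URL.
        For Each c As Cookie In cookies.GetCookies(New Uri(BaseUrl & "P_Search"))
            Console.WriteLine(c.Name & " = " & c.Value)
        Next

        ' Step 2: POST the form data with the same container attached.
        Dim body As Byte() = Encoding.UTF8.GetBytes("term_code=201130&search_type=K&keyword=math")
        Dim postReq As HttpWebRequest = DirectCast(WebRequest.Create(BaseUrl & "P_Results"), HttpWebRequest)
        postReq.Method = "POST"
        postReq.ContentType = "application/x-www-form-urlencoded"
        postReq.CookieContainer = cookies
        postReq.Referer = BaseUrl & "P_Search"
        postReq.ContentLength = body.Length
        Using reqStream As Stream = postReq.GetRequestStream()
            reqStream.Write(body, 0, body.Length)
        End Using

        Using postResp As WebResponse = postReq.GetResponse()
            Using reader As New StreamReader(postResp.GetResponseStream())
                Console.WriteLine(reader.ReadToEnd())
            End Using
        End Using
    End Sub
End Module
```

Because the container is attached before GetResponse() is called, there is no need to copy postresponse.Cookies into it by hand as in the question's code.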