Accessing multiple pages with HttpWebRequest

Time: 2012-04-30 16:35:44

Tags: .net vb.net httpwebrequest httpwebresponse

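I'm trying to screen-scrape a website that requires POST login authentication. I can authenticate on the first request, but when I turn around and request the next page, I get redirected to the login page (basically telling me I'm not logged in).

Code: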

Public Function GetPage(ByVal PageName As String, ByVal UserName As String, ByVal Password As String) As String
    Dim ReturnString As String = ""
    Dim Cookies As New CookieContainer
    Dim AuthURI As Uri = New Uri(AuthURL)
    Cookies.GetCookieHeader(AuthURI)
    Cookies.GetCookies(AuthURI)

    'Set Header/Meta Info
    System.Net.ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3
    Dim request As HttpWebRequest = HttpWebRequest.Create(AuthURL)
    request.Method = "POST"
    request.CookieContainer = Cookies
    request.UserAgent = "Mozilla/5.0 (Windows; U;Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"

    'Set POST Info
    Dim postData As String = "userName=" & HttpUtility.UrlEncode(UserName) & "&password=" & HttpUtility.UrlEncode(Password)
    Dim byteArray As Byte() = Encoding.UTF8.GetBytes(postData)
    request.ContentType = "application/x-www-form-urlencoded"
    request.ContentLength = byteArray.Length

    'Write to the request stream
    Dim dataStream As Stream = request.GetRequestStream()
    dataStream.Write(byteArray, 0, byteArray.Length)
    dataStream.Close()

    ' Get the response.
    Dim response As HttpWebResponse = request.GetResponse()
    dataStream = response.GetResponseStream()
    Dim reader As New StreamReader(dataStream)
    Dim responseFromServer As String = reader.ReadToEnd()
    ReturnString = responseFromServer

    'Append cookie data
    For Each c As Cookie In response.Cookies
        Cookies.Add(c)
    Next

    ' Clean up the streams.
    reader.Close()
    dataStream.Close()
    response.Close()

    'Bail on fail
    If ReturnString.Contains("Login failed") Then Return Nothing

    'Generate new request
    request = HttpWebRequest.Create(URLStub & PageName)
    request.Method = "POST"
    request.UserAgent = "Mozilla/5.0 (Windows; U;Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1"
    request.CookieContainer = Cookies

    'Set POST Info
    postData = "userName=" & HttpUtility.UrlEncode(UserName) & "&password=" & HttpUtility.UrlEncode(Password)
    byteArray = Encoding.UTF8.GetBytes(postData)
    request.ContentType = "application/x-www-form-urlencoded"
    request.ContentLength = byteArray.Length

    'Write to the request stream
    dataStream = request.GetRequestStream()
    dataStream.Write(byteArray, 0, byteArray.Length)
    dataStream.Close()

    'Get the response.
    response = request.GetResponse
    dataStream = response.GetResponseStream
    reader = New StreamReader(dataStream)
    responseFromServer = reader.ReadToEnd
    ReturnString = responseFromServer

    'Clean up the streams.
    reader.Close()
    dataStream.Close()
    response.Close()

    Return ReturnString
End Function

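This code mimics another page written in PHP. The PHP page works fine, and I'm using the same URLs as the PHP version, so I'm about 99% sure the problem isn't on the server side.

Also, I've searched other posts about this problem, and it seems I'm doing everything right syntactically, but maybe I'm missing something small and silly?

Any ideas? I've been fighting this for days. Thanks in advance. :)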


Edit: I've been playing with the session cookies and comparing the initial login with the actual page request. Here's what I have:

FIRST:
ORA_WX_SESSION: "1FAA2AB1EF40DF4BC291DD3326F1DC3C596F56CF-2#3"
JSESSIONID: a2c38b128e1e54051a2c95c5a3a1e3a4cb0cb5b7ba74cd260aaec531856d722f.e34SahmMbNaMe34Sa3yPaN8Sc40
XYZCustomerServiceUserName: (login-name)
XYZ: d53f8dcd87b861a61d99ac21ec53bb2b

LAST:
ORA_WX_SESSION: 1FAA2AB1EF40DF4BC291DD3326F1DC3C596F56CF-2#3
JSESSIONID: a2c38b128e1e54051a2c95c5a3a1e3a4cb0cb5b7ba74cd260aaec531856d722f.e34SahmMbNaMe34Sa3yPaN8Sc40

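I noticed two things. First, the quotes around the ORA_WX_SESSION value are stripped (even when I try to force them between the response and the request). Second, the XYZCustomerServiceUserName and XYZ cookies are dropped entirely. Other than that, the session information is identical between the two connection attempts.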
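
One possible reason a cookie shows up after login but is missing from the next request is its scope: a CookieContainer only sends a cookie to URLs that match the cookie's domain and path. The sketch below is a diagnostic aid, not a confirmed explanation of this case. The host example.com, the paths, and the cookie values are made up for illustration; only the cookie names come from the listing above.

```vbnet
Imports System
Imports System.Net

Module CookieScopeDemo
    Sub Main()
        Dim jar As New CookieContainer()

        ' Hypothetical host, paths, and values, for illustration only.
        ' A session cookie scoped to the whole site.
        jar.Add(New Cookie("JSESSIONID", "abc123", "/", "example.com"))
        ' A cookie scoped to the login area only.
        jar.Add(New Cookie("XYZ", "d53f8dcd", "/login", "example.com"))

        Dim urls As String() = {
            "https://example.com/login/auth",
            "https://example.com/account/page"
        }

        ' For each URL, print the Cookie header the container would send,
        ' followed by the scope of every cookie that matched.
        For Each url As String In urls
            Dim target As New Uri(url)
            Console.WriteLine(url & " -> " & jar.GetCookieHeader(target))
            For Each c As Cookie In jar.GetCookies(target)
                Console.WriteLine("    " & c.Name & " path=" & c.Path &
                                  " domain=" & c.Domain &
                                  " secure=" & c.Secure.ToString())
            Next
        Next
    End Sub
End Module
```

Running the same loop against the real container, with the login URL and the second page URL, should show whether XYZCustomerServiceUserName and XYZ were stored with a path or domain that does not cover the second page.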

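1 answer:

Answer 0: (score: 0)

The thing that logs you out can be anything the server expects and you haven't provided. We have done some automation using the System.Net classes, and it turned out that the best thing you can do is use WatiN or Selenium. We're using WatiN. With a tool like that, you aren't affected by simple changes on the server side.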
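
As a rough idea of what the browser-driven approach looks like, here is a minimal sketch using Selenium rather than WatiN. It assumes the Selenium.WebDriver NuGet package plus a local Chrome and matching driver, and it assumes the login form's input fields are named userName and password like the POST parameters in the question. GetPageViaBrowser, authUrl, and pageUrl are hypothetical names, and a real page may need explicit waits after the form is submitted.

```vbnet
Imports System
Imports OpenQA.Selenium
Imports OpenQA.Selenium.Chrome

Module BrowserLoginSketch
    ' Logs in through a real browser, then returns the HTML of a second page.
    ' authUrl and pageUrl are placeholders for the site's actual URLs.
    Function GetPageViaBrowser(authUrl As String, pageUrl As String,
                               userName As String, password As String) As String
        Dim driver As IWebDriver = New ChromeDriver()
        Try
            ' Load the login form and fill it in (field names are assumed).
            driver.Navigate().GoToUrl(authUrl)
            driver.FindElement(By.Name("userName")).SendKeys(userName)
            Dim passwordBox As IWebElement = driver.FindElement(By.Name("password"))
            passwordBox.SendKeys(password)
            passwordBox.Submit()

            ' The browser keeps cookies and any other session state itself,
            ' so the next navigation reuses the logged-in session.
            driver.Navigate().GoToUrl(pageUrl)
            Return driver.PageSource
        Finally
            ' Always shut the browser down, even if a step above fails.
            driver.Quit()
        End Try
    End Function
End Module
```

The trade-off is speed and a browser dependency in exchange for not having to reproduce by hand every header, cookie, or hidden form field the server checks.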