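I'm writing a web service that fetches and returns the HTML of specific pages on our website. The site requires a login, so I first POST the login details to the login page to obtain the cookies I need to access the page I want. Then I request the page I actually want.
Here is my code: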
Dim http As HttpWebRequest = TryCast(WebRequest.Create("http://www.mywebsite.com/loginpage"), HttpWebRequest)
http.KeepAlive = True
http.Method = "POST"
http.ContentType = "application/x-www-form-urlencoded"
Dim postData As String = "My post data"
Dim dataBytes As Byte() = UTF8Encoding.UTF8.GetBytes(postData)
http.ContentLength = dataBytes.Length
Using postStream As Stream = http.GetRequestStream()
    postStream.Write(dataBytes, 0, dataBytes.Length)
End Using
Dim httpResponse As HttpWebResponse = TryCast(http.GetResponse(), HttpWebResponse)
http = TryCast(WebRequest.Create("http://www.mywebsite.com/desiredpage"), HttpWebRequest)
http.CookieContainer = New CookieContainer()
http.CookieContainer.Add(httpResponse.Cookies)
Dim httpResponse2 As HttpWebResponse = TryCast(http.GetResponse(), HttpWebResponse)
Using httpResponse2
    Using reader As New StreamReader(httpResponse2.GetResponseStream())
        Dim html As String = reader.ReadToEnd()
        Return html
    End Using
End Using
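My problem: I only get the HTML of the login page back, instead of logging in successfully and receiving the cookies I need. That isn't surprising, because mywebsite.com/desiredpage redirects to the login page when the cookies are missing.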
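Update
Wireshark tells me the site is returning 6 cookies: .ASPXANONYMOUS, language, USERNAME_CHANGED, authentication, .DOTNETNUKE, and returnurl. I have confirmed that the first 3 are stored in http.CookieContainer, but the other 3 are not.
What happened to the remaining 3?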
Answer 0 (score: 0)
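My problem was in how I passed the cookies from the first request to the second.
Creating a separate CookieContainer and using that same instance for both requests solved the problem: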
' Requires: Imports System.IO, System.Net, System.Text
' One container shared by both requests, so cookies set during login
' are sent along with the request for the desired page.
Dim myCookies As CookieContainer = New CookieContainer()

' Request 1: POST the login form.
Dim http As HttpWebRequest = TryCast(WebRequest.Create("http://www.mywebsite.com/loginpage"), HttpWebRequest)
http.KeepAlive = True
http.Method = "POST"
http.ContentType = "application/x-www-form-urlencoded"
http.CookieContainer = myCookies
Dim postData As String = "My post data"
Dim dataBytes As Byte() = Encoding.UTF8.GetBytes(postData)
http.ContentLength = dataBytes.Length
Using postStream As Stream = http.GetRequestStream()
    postStream.Write(dataBytes, 0, dataBytes.Length)
End Using
Dim httpResponse As HttpWebResponse = TryCast(http.GetResponse(), HttpWebResponse)
' The login response body is not needed; close it to release the connection.
httpResponse.Close()

' Request 2: fetch the desired page, reusing the same cookie container.
Dim http2 As HttpWebRequest = TryCast(WebRequest.Create("http://www.mywebsite.com/desiredpage"), HttpWebRequest)
http2.CookieContainer = myCookies
Dim httpResponse2 As HttpWebResponse = TryCast(http2.GetResponse(), HttpWebResponse)
Using httpResponse2
    Using reader As New StreamReader(httpResponse2.GetResponseStream())
        Dim html As String = reader.ReadToEnd()
        Return html
    End Using
End Using
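To check which cookies the shared container actually holds after the login request, something like the sketch below may help. One plausible explanation for the three missing cookies, which I have not verified against this site, is that the login POST is answered with a redirect that HttpWebRequest follows automatically. In that case httpResponse.Cookies would only reflect the final response, while a CookieContainer attached to the request collects cookies from every response along the way. The helper name CookieNamesFor and the placeholder cookie values are made up for illustration. The Main routine fills the container by hand so the sketch runs without network access; in the real code you would pass in myCookies after the login call.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net

Module CookieDiagnostics
    ' Hypothetical helper (not part of the original code): returns the names
    ' of the cookies the container would send to the given URL.
    Function CookieNamesFor(container As CookieContainer, url As String) As List(Of String)
        Dim names As New List(Of String)()
        For Each c As Cookie In container.GetCookies(New Uri(url))
            names.Add(c.Name)
        Next
        Return names
    End Function

    Sub Main()
        ' Stand-in for the container shared by both requests. The values are
        ' placeholders; a real run would have them set by the login response.
        Dim myCookies As New CookieContainer()
        myCookies.Add(New Cookie("authentication", "placeholder", "/", "www.mywebsite.com"))
        myCookies.Add(New Cookie(".DOTNETNUKE", "placeholder", "/", "www.mywebsite.com"))

        ' List every cookie that would accompany a request for the desired page.
        For Each cookieName As String In CookieNamesFor(myCookies, "http://www.mywebsite.com/desiredpage")
            Console.WriteLine(cookieName)
        Next
    End Sub
End Module
```

If a cookie that Wireshark shows is missing from this list, its domain or path may not match the URL being queried, since GetCookies only returns cookies that apply to that URI.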