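Logging in to a website and scraping its HTML

Date: 2013-08-27 15:01:23

Tags: c# login web-scraping httpwebrequest

I'm a bit stuck here. I'm building a Windows application that reads data from a website. However, the website requires a login first, and I can't seem to get past it. I'm quite new to programming, so I hope someone knows a solution.

This is the code I use to log in: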

private void btnLogin2_Click(object sender, EventArgs e)
    {
        HttpWebRequest request = WebRequest.Create(LoginPageURL) as HttpWebRequest;
        request.KeepAlive = true;
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.2 Safari/537.36";
        string postData = "j_username=" + number + "&j_password=" + password;
        byte[] dataBytes = UTF8Encoding.UTF8.GetBytes(postData);
        request.ContentLength = dataBytes.Length;
        using (Stream postStream = request.GetRequestStream())
        {
            postStream.Write(dataBytes, 0, dataBytes.Length);
        }
        HttpWebResponse httpResponse = request.GetResponse() as HttpWebResponse;
        request = WebRequest.Create(Page2URL) as HttpWebRequest;
        request.UserAgent = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.2 Safari/537.36";
        request.CookieContainer = new CookieContainer();
        request.CookieContainer.Add(httpResponse.Cookies);
        request.Method = "GET";
        HttpWebResponse httpResponse2 = request.GetResponse() as HttpWebResponse;
        StreamReader stream = new StreamReader(httpResponse2.GetResponseStream(), System.Text.Encoding.UTF8);
        string result = stream.ReadToEnd();
        stream.Close();

        tbOutput2.Text = result;
    }

The goal is to get the HTML of the page behind the login (Page2URL), but I keep getting the HTML of the login page instead.

1 answer:

Answer 0 (score: 0)

You are adding the cookies from the first response to the second request:

request.CookieContainer.Add(httpResponse.Cookies);

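The cookies in that response may be empty! The login request was sent without a CookieContainer, and HttpWebResponse.Cookies is only populated when the request has one. To work around this, read the cookie value from the response headers and add it to the next request, like this: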

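 // Read the raw Set-Cookie header from the login response and send it back on the next request.
 string response_header_cookies = httpResponse.Headers.Get("Set-Cookie");
 request.Headers.Add("Cookie", response_header_cookies);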
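One caveat with copying the header verbatim: a Set-Cookie value usually carries attributes such as Path or HttpOnly, which are not meant to appear in a Cookie request header. A minimal sketch of one way to strip them is to let CookieContainer parse the value and rebuild the header. The class and method names here (CookieHeaderHelper, ToCookieHeader) are hypothetical and not part of the original answer.

```csharp
using System;
using System.Net;

static class CookieHeaderHelper
{
    // Hypothetical helper: turns a raw Set-Cookie value into a string
    // suitable for a Cookie request header ("name=value; name2=value2").
    public static string ToCookieHeader(Uri uri, string setCookieHeader)
    {
        // Headers.Get("Set-Cookie") returns null when the server sent no cookie.
        if (string.IsNullOrEmpty(setCookieHeader))
            return string.Empty;

        var jar = new CookieContainer();
        // Parse the header for this URI; attributes such as Path are stored
        // on the Cookie objects instead of staying in the text.
        jar.SetCookies(uri, setCookieHeader);
        // Rebuild only the name=value pairs that apply to this URI.
        return jar.GetCookieHeader(uri);
    }
}
```

With the variables from the question, this could be used as `request.Headers.Add("Cookie", CookieHeaderHelper.ToCookieHeader(httpResponse.ResponseUri, httpResponse.Headers.Get("Set-Cookie")));`, assuming the login response actually contains a Set-Cookie header.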

In most cases, reading the Set-Cookie header directly is the more effective approach. Hope this helps! Source: MSDN
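As an alternative that is not part of the answer above, the same CookieContainer can be assigned to both requests before they are sent, so the framework stores and replays the session cookies itself. This also covers the case where the login answers with a redirect: with automatic redirects enabled (the default), the final response's headers may no longer include the Set-Cookie value from the redirecting response. The sketch below assumes the j_username and j_password form fields from the question; the class and method names (LoginScraper, BuildLoginBody, FetchAfterLogin) are hypothetical, and the real site may require additional form fields or headers.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class LoginScraper
{
    // Builds the form body. Values are URL-encoded so that characters such as
    // '&' or '=' in a password do not corrupt the payload.
    public static string BuildLoginBody(string user, string password)
    {
        return "j_username=" + Uri.EscapeDataString(user) +
               "&j_password=" + Uri.EscapeDataString(password);
    }

    // Logs in, then fetches pageUrl with the same cookies and returns its HTML.
    public static string FetchAfterLogin(string loginUrl, string pageUrl,
                                         string user, string password)
    {
        // One cookie jar shared by every request in the session.
        var cookies = new CookieContainer();

        var login = (HttpWebRequest)WebRequest.Create(loginUrl);
        login.Method = "POST";
        login.ContentType = "application/x-www-form-urlencoded";
        login.CookieContainer = cookies; // must be set before GetResponse()

        byte[] body = Encoding.UTF8.GetBytes(BuildLoginBody(user, password));
        login.ContentLength = body.Length;
        using (Stream postStream = login.GetRequestStream())
        {
            postStream.Write(body, 0, body.Length);
        }

        // The response body is not needed; sending the request is enough
        // for the server's cookies to be stored in the shared jar.
        using (login.GetResponse()) { }

        var page = (HttpWebRequest)WebRequest.Create(pageUrl);
        page.Method = "GET";
        page.CookieContainer = cookies; // replays the session cookies

        using (var response = (HttpWebResponse)page.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

In the question's click handler this would reduce to something like `tbOutput2.Text = LoginScraper.FetchAfterLogin(LoginPageURL, Page2URL, number, password);`. If the login page is still returned, comparing the request with what a browser sends (hidden form fields, the actual form action URL) would be the next thing to check.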