Getting data that sits behind username/password authentication

Date: 2011-04-19 10:20:56

Tags: c# cookies screen-scraping wget

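I want to download some data from a forum. The pages that contain the data are visible only to registered users. Here is an example page containing user data: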

http://www.bikeforums.net/member.php/227664-StackOverflow

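I would like to fetch the data with wget or C#. I tried logging in through Firefox and then passing the cookie file (which I hoped would contain the login information) to wget. That was more of a temporary hack than a real solution, and it failed anyway. How should I do this?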

I set up an account for testing, in case it helps.

User: StackOverflow

Password: so123

1 answer:

Answer 0 (score: 0)

With Firebug you can easily capture the POST data that the login page sends, then use it to build a WebRequest that logs in to the forum.
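As a rough sketch of how such a POST body could be assembled in code rather than pasted from Firebug: the field names below are the ones in the captured request shown further down, while the `LoginForm` class and its methods are hypothetical helpers. It assumes that the `vb_login_md5password` fields carry the lowercase hex MD5 of the password, which the field names suggest but which is not confirmed here.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, not part of the original answer.
public static class LoginForm
{
    // Returns the lowercase hexadecimal MD5 digest of the UTF-8 bytes of text.
    public static string Md5Hex(string text)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(text));
            StringBuilder hex = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                hex.Append(b.ToString("x2"));
            }
            return hex.ToString();
        }
    }

    // Builds a form-encoded body with the same fields as the captured request.
    // Assumption: both hash fields take the same MD5 value, and the plain
    // password field is deliberately left empty.
    public static string BuildBody(string username, string password, string returnUrl)
    {
        string hash = Md5Hex(password);
        return "do=login"
            + "&vb_login_md5password=" + hash
            + "&vb_login_md5password_utf=" + hash
            + "&s=&securitytoken=guest"
            + "&url=" + Uri.EscapeDataString(returnUrl)
            + "&vb_login_username=" + Uri.EscapeDataString(username)
            + "&vb_login_password=";
    }
}
```

Calling `LoginForm.BuildBody("StackOverflow", password, "/member.php/227664-StackOverflow")` would give a string that can be used as the POST data. If the forum rejects it, the hash assumption is the first thing to compare against what Firebug shows.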

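The server creates a cookie for authentication. We can send this cookie with the next request for the forum page so that the server can validate the request and return all the data.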
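To illustrate how a cookie is carried from one request to the next, here is a small offline sketch using `CookieContainer`. The cookie name `sessionhash`, its value, and the `CookieDemo` class are made up for illustration. In a real run the server sets the cookies and the container picks them up from the login response.

```csharp
using System;
using System.Net;

// Hypothetical demo class, not part of the original answer.
public static class CookieDemo
{
    // Creates a container holding one made-up session cookie for the forum host,
    // standing in for what the server would set after a successful login.
    public static CookieContainer JarWithSessionCookie()
    {
        CookieContainer jar = new CookieContainer();
        jar.Add(new Uri("http://www.bikeforums.net/"),
                new Cookie("sessionhash", "abc123", "/"));
        return jar;
    }

    // Returns the Cookie header value the container would attach
    // to a request for the given URL.
    public static string HeaderFor(CookieContainer jar, string url)
    {
        return jar.GetCookieHeader(new Uri(url));
    }
}
```

Asking `HeaderFor` about the member page on the same host should return the stored cookie as a name=value pair, whereas a URL on a different host should yield an empty string. This is why the same container instance has to be assigned to both the login request and the follow-up request.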

Here is a simple console application, which I tested, that implements this mechanism.

using System;
using System.IO;
using System.Net;
using System.Text;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            // The container collects the cookies the server sets at login.
            CookieContainer cookieContainer = new CookieContainer();
            HttpWebRequest wpost = (HttpWebRequest)WebRequest.Create("http://www.bikeforums.net/login.php?do=login");
            wpost.CookieContainer = cookieContainer;
            wpost.Method = "POST";
            // Form body captured with Firebug: the password travels in the
            // vb_login_md5password fields and vb_login_password is left empty.
            string postData = "do=login&vb_login_md5password=d93bd4ce1af6a9deccaf0ea844d6c05d&vb_login_md5password_utf=d93bd4ce1af6a9deccaf0ea844d6c05d&s=&securitytoken=guest&url=%2Fmember.php%2F227664-StackOverflow&vb_login_username=StackOverflow&vb_login_password=";
            byte[] byteArray = Encoding.UTF8.GetBytes(postData);
            wpost.ContentType = "application/x-www-form-urlencoded";
            wpost.ContentLength = byteArray.Length;
            // Write the form data to the request stream and close it.
            using (Stream dataStream = wpost.GetRequestStream())
            {
                dataStream.Write(byteArray, 0, byteArray.Length);
            }
            // Send the login request; the cookies from the response are stored in cookieContainer.
            HttpWebResponse response = (HttpWebResponse)wpost.GetResponse();
            response.Close();

            // Request the member page, which is visible only to logged-in users.
            wpost = (HttpWebRequest)WebRequest.Create("http://www.bikeforums.net/member.php/227664-StackOverflow");

            // Assign the cookies received at login to the new request.
            wpost.CookieContainer = cookieContainer;
            wpost.Method = "GET";
            response = (HttpWebResponse)wpost.GetResponse();

            // Wrap the response stream in a reader with the required encoding and print the page.
            using (Stream receiveStream = response.GetResponseStream())
            using (StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8))
            {
                Console.WriteLine(readStream.ReadToEnd());
            }
            response.Close();

            // Wait for input so the console window stays open.
            Console.Read();

        }
    }
}