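How do I programmatically retrieve a web page that uses an HTML redirect?

Time: 2009-03-10 14:47:29

Tags: c# httpwebrequest

I have written this code to log in, retrieve a web page, and display it: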

  // login info array
        string postData = "user_name=tler";
        postData += "&user_password=lodvader";
        byte[] data = Encoding.ASCII.GetBytes(postData);

        // web request
        WebRequest req = WebRequest.Create("http://www.lol.com/login.php");
        req.Method = "POST";
        req.ContentType = "application/x-www-form-urlencoded";
        req.ContentLength = data.Length;

        // write the POST body to the request stream
        Stream newStream = req.GetRequestStream();
        newStream.Write(data, 0, data.Length);
        newStream.Close();

        // read the response into a string
        StreamReader reader = new StreamReader(req.GetResponse().GetResponseStream(), Encoding.GetEncoding("iso-8859-1"));

        string responseString = reader.ReadToEnd();

        // retrieve text within title
        Regex rx = new Regex(@"(?<=<title>).+?(?=</title>)");

        var variable = rx.Matches(responseString);

        // output
        Console.WriteLine(variable[0]);

        Console.ReadLine();

However, the page that follows the login is an HTML redirect, like this:

<meta http-equiv="refresh" content="3; URL="bb.php">

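How can I follow this link and retrieve the next page?

4 answers:

Answer 0: (score: 2)

Just send a new WebRequest to the bb.php file. Make sure you use the same CookieContainer, because I assume login.php uses a cookie-based session to remember you. Take a look at the HttpWebRequest.CookieContainer property. This requires you to cast the WebRequest to HttpWebRequest.

Added: (I couldn't write sample code in a comment.)

I am writing this code without proofreading it...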

// one container shared by both requests, so the session cookie carries over
var cookies = new CookieContainer();

var firstReq = (HttpWebRequest)WebRequest.Create(".../login.php");
firstReq.CookieContainer = cookies;

var secondReq = (HttpWebRequest)WebRequest.Create(".../bb.php");
secondReq.CookieContainer = cookies;
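
To make the point about sharing the container concrete, here is a minimal sketch that needs no network access. It stores a cookie in a CookieContainer the way a login response would, then asks the container which Cookie header it would send to another page on the same site. The host example.com and the cookie PHPSESSID=abc123 are made-up values for illustration, not anything from the question.

```csharp
using System;
using System.Net;

static class CookieSharingSketch
{
    // Returns the Cookie header the container would attach to a request for bb.php
    // after a (simulated) login response has set a session cookie.
    public static string SharedCookieHeader()
    {
        var cookies = new CookieContainer();

        // Simulate the Set-Cookie header of the login response.
        // Hypothetical values: example.com and PHPSESSID=abc123.
        cookies.SetCookies(new Uri("http://example.com/login.php"), "PHPSESSID=abc123; path=/");

        // Any later request to the same site that uses this container sends the cookie back.
        return cookies.GetCookieHeader(new Uri("http://example.com/bb.php"));
    }

    static void Main()
    {
        Console.WriteLine(SharedCookieHeader());
    }
}
```

A request created with a fresh CookieContainer (or none at all) would not carry that cookie, which is why the second request has to reuse the first one's container.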

Answer 1: (score: 2)

I found the time to finish it. Here is the answer (I have tried to make it as clear as possible):

        // Cookie for our session
        var cookieContainer = new CookieContainer();

        // Encode post variables
        ASCIIEncoding encoding=new ASCIIEncoding();
        byte[] loginDataBytes = encoding.GetBytes("user_name=belaz&user_password=123");

        // Prepare our login HttpWebRequest
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://blabla.fr/verify.php");
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookieContainer;
        request.ContentLength = loginDataBytes.Length;

        // Write encoded post variable to the stream
        Stream newStream = request.GetRequestStream();
        newStream.Write(loginDataBytes, 0, loginDataBytes.Length);
        newStream.Close();

        // Retrieve HttpWebResponse
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();

        // Link the response cookie to the domain
        cookieContainer.Add(new Uri("http://blabla.fr/"),response.Cookies);

        // Prepare the follow-up HttpWebRequest and attach the same cookie container.
        HttpWebRequest requestProfile = (HttpWebRequest)WebRequest.Create("http://blabla.fr/bb.php");
        requestProfile.CookieContainer = cookieContainer;

        // Retrieve HttpWebResponse
        HttpWebResponse responseProfile = (HttpWebResponse)requestProfile.GetResponse();

        // Retrieve stream response and read it to end
        Stream st = responseProfile.GetResponseStream();
        StreamReader sr = new StreamReader(st);
        string buffer = sr.ReadToEnd();

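Answer 2: (score: 0)

HttpWebRequest has a property named AllowAutoRedirect. Set it to true. There is also a property named MaximumAutomaticRedirections. Set that to a value high enough to make sure all of the redirects are followed.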
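
A minimal sketch of setting those two properties follows. The cap of 10 is an arbitrary illustrative choice, and the URL passed in Main is just the one from the question. As far as I know, these settings govern redirects that the server sends as HTTP 3xx responses, so treat this as an illustration of the properties rather than a guaranteed fix for a meta refresh tag.

```csharp
using System;
using System.Net;

static class AutoRedirectSketch
{
    // Builds a request that follows HTTP-level redirects, up to a fixed cap.
    public static HttpWebRequest Create(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true;           // true is already the default
        request.MaximumAutomaticRedirections = 10;  // arbitrary cap chosen for this sketch
        return request;
    }

    static void Main()
    {
        // Only configures the request; nothing is sent until GetResponse() is called.
        HttpWebRequest request = Create("http://www.lol.com/login.php");
        Console.WriteLine(request.AllowAutoRedirect);
        Console.WriteLine(request.MaximumAutomaticRedirections);
    }
}
```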

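Answer 3: (score: 0)

You can't do this easily, because the meta tag is read and acted on by the client.

In this case, since you are using HttpWebRequest, the request doesn't care what the returned text might do.

So you need to make another request to the page named in the URL attribute (bb.php).

-

If the server had performed the redirect, there would be no problem.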
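
For the extra request described in this answer, the target has to be pulled out of the meta tag first. Below is a rough sketch of one way to do that with a regular expression. The class and method names are hypothetical, the pattern assumes http-equiv appears before content inside the tag, and a real HTML parser would be more robust. It tolerates optional quotes around the target, including the stray quote in the sample tag from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class MetaRefreshSketch
{
    // Captures the target of <meta http-equiv="refresh" content="N; URL=target">.
    // Assumption: http-equiv comes before content; quotes around the target are optional.
    static readonly Regex MetaRefresh = new Regex(
        @"<meta[^>]*http-equiv\s*=\s*[""']?refresh[""']?[^>]*content\s*=\s*[""']?\s*\d+\s*;\s*url\s*=\s*[""']?(?<url>[^""'>\s]+)",
        RegexOptions.IgnoreCase);

    // Returns the absolute redirect target, or null when the page has no meta refresh tag.
    public static Uri FindRefreshTarget(string html, Uri pageUri)
    {
        Match match = MetaRefresh.Match(html);
        if (!match.Success)
        {
            return null;
        }
        // Resolve a relative target such as "bb.php" against the page that contained the tag.
        return new Uri(pageUri, match.Groups["url"].Value);
    }

    static void Main()
    {
        string html = "<meta http-equiv=\"refresh\" content=\"3; URL=\"bb.php\">";
        Uri target = FindRefreshTarget(html, new Uri("http://www.lol.com/login.php"));
        Console.WriteLine(target);
    }
}
```

The returned Uri can then be passed to WebRequest.Create for the second request, using the same CookieContainer as the login request so the session carries over.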