GetResponse() returns StatusCode OK, but the URL is redirected

Date: 2017-10-01 08:17:42

Tags: c# httpresponse url-redirection http-status-codes

When I execute the following lines of code:

string link = "https://finance.yahoo.com/r/4951f719-c8e1-3b1d-b4db-684ef6739b8e/iphone-x-5-reasons-to-wait?utm_source=yahoo&utm_medium=partner&utm_campaign=yahootix&partner=yahootix&yptr=yahoo&.tsrc=rss";
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(link);
myHttpWebRequest.MaximumAutomaticRedirections = 100;
myHttpWebRequest.AllowAutoRedirect = true;
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();

I get StatusCode "OK", and ResponseUri is still the original URL rather than the redirected one.

How can I get the URL of the final redirect?

1 answer:

Answer 0: (score: 1)

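The AllowAutoRedirect property only takes effect when the URL is redirected at the HTTP protocol level (when the server returns an HTTP 30x status code). This URL, however, is redirected by JavaScript contained in the HTML page (or by a meta http-equiv='refresh' tag if the browser does not support JavaScript). You therefore have to parse the content of the HTML page and read the URL from it. The following code uses the HtmlAgilityPack library (available as a NuGet package) to parse the HTML page and read the required URL: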

string link = "https://finance.yahoo.com/r/4951f719-c8e1-3b1d-b4db-684ef6739b8e/iphone-x-5-reasons-to-wait?utm_source=yahoo&utm_medium=partner&utm_campaign=yahootix&partner=yahootix&yptr=yahoo&.tsrc=rss";
HttpWebRequest myHttpWebRequest = (HttpWebRequest)WebRequest.Create(link);
HttpWebResponse myHttpWebResponse = (HttpWebResponse)myHttpWebRequest.GetResponse();

using (var responseStream = myHttpWebResponse.GetResponseStream())
{
    using (var sr = new StreamReader(responseStream))
    {
        //content of HTML page
        var html = sr.ReadToEnd();

        //using HTMLAgilityPack library to parse HTML page
        var htmlDocument = new HtmlAgilityPack.HtmlDocument();
        htmlDocument.LoadHtml(html);

        //find URL in HTML page. Null-checks omitted for simplicity
        var contentAttribute = htmlDocument.DocumentNode.SelectSingleNode("noscript/meta").Attributes["content"].Value;
        //content looks like "0;URL='...'"; split only once so a ';' inside the URL is preserved
        var url = contentAttribute.Split(new[] { ';' }, 2)[1].Trim();//skip "0;" in contentAttribute
        url = url.Substring(4);//skip 'URL='
        url = url.Trim('\'');//remove surrounding '
    }
}
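
The string handling above omits null checks and assumes the content attribute always has the form 0;URL='...'. As a minimal sketch of a more defensive version, the helper below isolates that parsing step. The class name MetaRefresh and the method ExtractUrl are hypothetical names introduced here for illustration; they are not part of the original answer or of HtmlAgilityPack. The sketch returns null instead of throwing when no URL is present.

```csharp
using System;

public static class MetaRefresh
{
    // Hypothetical helper: extracts the target URL from the content attribute
    // of a <meta http-equiv="refresh"> tag, e.g. "0;URL='https://example.com/page'".
    // Returns null when the attribute carries no URL.
    public static string ExtractUrl(string content)
    {
        if (string.IsNullOrWhiteSpace(content))
            return null;

        // Split only once so that a ';' inside the URL itself is preserved.
        var parts = content.Split(new[] { ';' }, 2);
        if (parts.Length < 2)
            return null; // only a delay was given, e.g. "5"

        var target = parts[1].Trim();
        const string prefix = "URL=";
        if (!target.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            return null;

        // Drop the "URL=" prefix and any surrounding single or double quotes.
        target = target.Substring(prefix.Length).Trim().Trim('\'', '"');
        return target.Length == 0 ? null : target;
    }
}
```

With this in place, the value read from the meta tag could be passed to MetaRefresh.ExtractUrl and the result checked for null before creating a second HttpWebRequest for the extracted URL.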