C# HtmlAgilityPack: loading external HTML

Date: 2014-12-07 00:32:12

Tags: c# html-agility-pack

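I am using HtmlAgilityPack to get HTML from a website. The site issues its requests through XMLHttpRequest, and the resulting HTML is loaded into a DIV. I tried the request below, but I cannot get that external HTML.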

            HttpWebRequest getRequest = WebRequest.Create(Url) as HttpWebRequest;
            getRequest.CookieContainer = cookieJar;
            getRequest.Method = WebRequestMethods.Http.Post;
            getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0";
            getRequest.AllowWriteStreamBuffering = true;
            getRequest.ProtocolVersion = HttpVersion.Version11;
            getRequest.AllowAutoRedirect = true;
            getRequest.ContentType = "application/x-www-form-urlencoded";

            Stream newStream1 = getRequest.GetRequestStream();
            newStream1.Close();
            HttpWebResponse getRequestResponse = (HttpWebResponse)getRequest.GetResponse();
            string source = "";

            using (StreamReader sr = new StreamReader(getRequestResponse.GetResponseStream(), Encoding.Default))
            {
                source = sr.ReadToEnd();
                //Console.WriteLine(source);
            }
            doc.LoadHtml(source);
            getRequestResponse.Close();

1 answer:

Answer 0 (score: -1)

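I suggest updating your code to use the HttpClient class. It supports asynchronous operation, which can help the scalability of your application (in some cases).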

MSDN HttpClient Docs
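
As a rough illustration of that suggestion, here is a minimal sketch of the same fetch done with HttpClient and handed to HtmlAgilityPack. The URL is a placeholder, not taken from the question; it assumes you have found the address the page's XMLHttpRequest actually calls (for example, in the browser's network tab) and that the endpoint answers a plain GET.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

class Program
{
    static async Task Main()
    {
        // Hypothetical URL: replace with the address the page's XMLHttpRequest calls.
        var url = "https://example.com/fragment";

        // Keep cookies and follow redirects, as the original request did.
        var handler = new HttpClientHandler
        {
            CookieContainer = new CookieContainer(),
            AllowAutoRedirect = true
        };

        using (var client = new HttpClient(handler))
        {
            // Same user agent string as in the question.
            client.DefaultRequestHeaders.UserAgent.ParseAdd(
                "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0");

            // Asynchronous GET; throws HttpRequestException on a non-success status code.
            string source = await client.GetStringAsync(url);

            var doc = new HtmlDocument();
            doc.LoadHtml(source);
            Console.WriteLine(doc.DocumentNode.OuterHtml);
        }
    }
}
```

In a long-running application you would normally create one HttpClient and reuse it rather than creating one per request.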

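However, looking at your current implementation, if you are getting a 405 error, it is probably because you are trying to POST to the URL. Since I don't see any POST data in the snippet you showed, you would be better off using a GET request: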

getRequest.Method = WebRequestMethods.Http.Get;
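
Put together, the question's snippet with that change might look like the sketch below. It assumes the endpoint accepts GET; the URL, the UTF-8 encoding, and the `content` div id are placeholders that do not come from the question. Note that the `GetRequestStream()` call and the form `ContentType` are dropped: a GET has no request body, and asking for the request stream on a GET request throws a `ProtocolViolationException`.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical URL: replace with the address that returns the HTML fragment.
        string url = "https://example.com/fragment";
        var cookieJar = new CookieContainer();

        var getRequest = (HttpWebRequest)WebRequest.Create(url);
        getRequest.CookieContainer = cookieJar;
        getRequest.Method = WebRequestMethods.Http.Get;
        getRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:33.0) Gecko/20100101 Firefox/33.0";
        getRequest.AllowAutoRedirect = true;

        string source;
        // The using blocks close the response and the reader even if reading fails.
        using (var response = (HttpWebResponse)getRequest.GetResponse())
        // UTF-8 is an assumption; use the encoding the server actually sends.
        using (var sr = new StreamReader(response.GetResponseStream(), Encoding.UTF8))
        {
            source = sr.ReadToEnd();
        }

        var doc = new HtmlDocument();
        doc.LoadHtml(source);

        // Hypothetical id: select the DIV that holds the loaded HTML.
        HtmlNode div = doc.DocumentNode.SelectSingleNode("//div[@id='content']");
        Console.WriteLine(div != null ? div.OuterHtml : "div not found");
    }
}
```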