How to get the HTML source of a page with C#

Time: 2017-01-21 10:10:23

Tags: c#

I want to save a complete web page (ASP) from a URL to a local drive as an .htm file, but I have not succeeded.

Code:

    public StreamReader Fn_DownloadWebPageComplete(string link_Pagesource)
    {
        //--------- Download Complete ------------------
        // using (WebClient client = new WebClient()) // WebClient class inherits IDisposable
        // {
        //client
        //HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link_Pagesource);
        //webRequest.AllowAutoRedirect = true;
        //var client1 = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(link_Pagesource);
        //client1.CookieContainer = new System.Net.CookieContainer();
        // client.DownloadFile(link_Pagesource, @"D:\S1.htm");
        // }

        //--------- Download Page Source ------------------
        HttpWebRequest URL_pageSource = (HttpWebRequest)WebRequest.Create("https://www.digikala.com");
        URL_pageSource.Timeout = 360000;
        //URL_pageSource.Timeout = 1000000;
        URL_pageSource.ReadWriteTimeout = 360000;
        // URL_pageSource.ReadWriteTimeout = 1000000;
        URL_pageSource.AllowAutoRedirect = true;
        URL_pageSource.MaximumAutomaticRedirections = 300;
        using (WebResponse MyResponse_PageSource = URL_pageSource.GetResponse())
        {
            str_PageSource = new StreamReader(MyResponse_PageSource.GetResponseStream(), System.Text.Encoding.UTF8);
            pagesource1 = str_PageSource.ReadToEnd();
            success = true;
        }

Error:

Too many automatic redirections were attempted.

I tried this code, but it did not work.

The code works for many URLs, but not for this one.

3 Answers:

Answer 0: (score: 4)

Here is one way:

    string url = "https://www.digikala.com/";

    using (HttpClient client = new HttpClient())
    {
        using (HttpResponseMessage response = client.GetAsync(url).Result)
        {
            using (HttpContent content = response.Content)
            {
                string result = content.ReadAsStringAsync().Result;
            }
        }
    }

The result variable will contain the HTML of the page, which you can then save to a file like this:

System.IO.File.WriteAllText("path/filename.html", result);

Note that you need to import this namespace:

using System.Net.Http;

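Update: If you are using an older version of VS, you can look at this answer, which uses WebClient or WebRequest for the same purpose, but updating your VS is really the better solution.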
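In the snippet above, result is declared inside the innermost using block, so the File.WriteAllText call has to sit inside that block to compile. The following is a minimal sketch of one way to combine the download and the save, using async/await instead of blocking on .Result. The names PageSaver and SavePageAsync are hypothetical and not part of the original answer, and the sketch has not been verified against the site from the question.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class PageSaver
{
    // Hypothetical helper (not from the answer above): downloads the HTML at
    // `url` with the supplied HttpClient, writes it to `path`, and returns it.
    public static async Task<string> SavePageAsync(HttpClient client, string url, string path)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            // Throws HttpRequestException if the status code is not a success code.
            response.EnsureSuccessStatusCode();

            string html = await response.Content.ReadAsStringAsync();
            File.WriteAllText(path, html);
            return html;
        }
    }
}
```

From an async method this could be called as await PageSaver.SavePageAsync(client, "https://www.digikala.com/", "page.html"), where client is an HttpClient that the caller creates and disposes. Passing the client in keeps the helper easy to exercise with a stub message handler instead of a live site.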

Answer 1: (score: 1)

using (WebClient client = new WebClient ())
{
    string htmlCode = client.DownloadString("https://www.digikala.com");
}

Answer 2: (score: 1)

using (WebClient client = new WebClient ())
{
    client.DownloadFile("https://www.digikala.com", @"C:\localfile.html");
}
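If the "Too many automatic redirections were attempted" error from the question still appears with HttpWebRequest, one commonly reported cause is a site that sets a cookie and then redirects. Without a CookieContainer the cookie is never sent back, so the redirect repeats until the limit is reached. The sketch below assumes that this is what happens here, which has not been verified for this URL. The names RedirectFriendlyDownloader, CreateRequest and DownloadPageSource are hypothetical.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class RedirectFriendlyDownloader
{
    // Hypothetical helper: builds a request that keeps cookies across redirects.
    public static HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true;
        request.MaximumAutomaticRedirections = 50;
        // Assumption: the redirect loop is caused by cookies not being stored.
        request.CookieContainer = new CookieContainer();
        return request;
    }

    // Sends the request and returns the response body decoded as UTF-8.
    public static string DownloadPageSource(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The returned string can be written to disk with System.IO.File.WriteAllText, as in Answer 0. On recent .NET versions WebRequest and HttpWebRequest are marked obsolete in favor of HttpClient, so this form mainly suits projects that already use HttpWebRequest, as the question does.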