Question

How can I get the HTML source code of a given URL in C#?

Answer 1

You can download files with the WebClient class:

using System.Net;

using (WebClient client = new WebClient()) // WebClient implements IDisposable
{
    client.DownloadFile("http://yoursite.com/page.html", @"C:\localfile.html");

    // Or you can get the file content without saving it
    string htmlCode = client.DownloadString("http://yoursite.com/page.html");
}

Answer 2

Basically:

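using System;     // Console
using System.IO;  // StreamReader
using System.Net;
using System.Net.Http;  // in LINQPad, also add a reference to System.Net.Http.dll

WebRequest req = WebRequest.Create("http://google.com");
req.Method = "GET";

string source;
// Dispose the response as well as the reader so the connection is released
using (WebResponse response = req.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}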

Console.WriteLine(source);

Answer 3

You can get it with:

var html = new System.Net.WebClient().DownloadString(siteUrl);  // siteUrl: address of the page to download

Answer 4

This post is really old (it was 7 years old when I answered it), so none of the other solutions use the new, recommended way, which is the HttpClient class.

HttpClient is considered the new API and should replace the old ones (WebClient and WebRequest):

string url = "page url";
using (HttpClient client = new HttpClient())
{
    using (HttpResponseMessage response = client.GetAsync(url).Result)
    {
        using (HttpContent content = response.Content)
        {
            string result = content.ReadAsStringAsync().Result;
        }
    }
}

For more information on how to use the HttpClient class (especially in async scenarios), you can refer to this question.
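
Since this answer points to asynchronous use of HttpClient, here is a minimal sketch of what an awaited version might look like. The class name HtmlFetcher, the method name GetHtmlAsync and the example.com address are placeholders chosen for illustration, not part of the original answer, and an async Main needs C# 7.1 or later.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HtmlFetcher
{
    // Hypothetical helper: downloads the body of `url` as a string.
    // The caller supplies the HttpClient so one instance can be reused
    // across many requests.
    public static async Task<string> GetHtmlAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes (e.g. 404, 500)
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    public static async Task Main()
    {
        using (HttpClient client = new HttpClient())
        {
            // Placeholder address; replace with the page you want
            string html = await GetHtmlAsync(client, "http://example.com/");
            Console.WriteLine(html.Length);
        }
    }
}
```

Awaiting the calls instead of reading .Result avoids blocking the calling thread while the request is in flight, which matters most in UI and server code.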

Answer 5

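@cms's approach is the more recent one and is the one suggested on the Microsoft website, but I ran into a hard-to-solve problem with both methods posted here, so I am posting the solution for everyone.

Problem: if you use a URL like www.somesite.it/?p=1500, in some cases you get an Internal Server Error (500), even though www.somesite.it/?p=1500 works perfectly in a web browser.

Solution: you have to move the parameters out of the URL. Working code: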

using System.Net;
//...
using (WebClient client = new WebClient ()) 
{
    client.QueryString.Add("p", "1500"); //add parameters
    string htmlCode = client.DownloadString("http://www.somesite.it"); // absolute URL; the scheme is required
    //...
}

See the official documentation for details.
