How can I get the HTML source of a given URL in C#?
Answer 0 (score: 174)
You can use the WebClient class to download the file:
using System.Net;
using (WebClient client = new WebClient()) // WebClient implements IDisposable
{
    client.DownloadFile("http://yoursite.com/page.html", @"C:\localfile.html");
    // Or you can get the file content without saving it
    string htmlCode = client.DownloadString("http://yoursite.com/page.html");
}
Answer 1 (score: 38)
Basically:
using System;
using System.IO; // StreamReader
using System.Net;
using System.Net.Http; // in LINQPad, also add a reference to System.Net.Http.dll
WebRequest req = WebRequest.Create("http://google.com");
req.Method = "GET";
string source;
// Dispose the response as well as the reader so the connection is released
using (WebResponse response = req.GetResponse())
using (StreamReader reader = new StreamReader(response.GetResponseStream()))
{
    source = reader.ReadToEnd();
}
Console.WriteLine(source);
Answer 2 (score: 15)
You can get it with:
var html = new System.Net.WebClient().DownloadString(siteUrl);
Answer 3 (score: 12)
This post is really old (it was 7 years old when I answered it), so none of the other answers use the new recommended way, which is the HttpClient class.
HttpClient is considered the new API and should replace the old ones (WebClient and WebRequest).
using System.Net.Http;
string url = "page url";
using (HttpClient client = new HttpClient())
{
    using (HttpResponseMessage response = client.GetAsync(url).Result)
    {
        using (HttpContent content = response.Content)
        {
            string result = content.ReadAsStringAsync().Result;
        }
    }
}
For more information on how to use the HttpClient class (especially in async cases), you can refer to this question.
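Since the snippet above blocks on .Result, here is a minimal sketch of what the async variant might look like. It is an illustration rather than the answerer's code: the GetHtmlAsync helper, the Program class, and the example.com URL are placeholders made up here, and the async Main signature assumes C# 7.1 or later.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    // A single shared HttpClient is commonly reused across requests
    // instead of creating and disposing one per call.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: downloads the body at the given URL as a string.
    static async Task<string> GetHtmlAsync(string url)
    {
        using (HttpResponseMessage response = await Client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes (e.g. 404, 500).
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }

    static async Task Main()
    {
        // Placeholder URL; replace with the page you want to fetch.
        string html = await GetHtmlAsync("http://example.com/");
        Console.WriteLine(html.Length);
    }
}
```

Awaiting the calls avoids blocking the calling thread, which matters mostly in UI and server code where blocking on .Result can cause deadlocks.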
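Answer 4 (score: 10)
Problem:
If you use a URL like this: www.somesite.it/?p=1500
in some cases you will get an Internal Server Error (500),
even though in a web browser www.somesite.it/?p=1500
works perfectly.
Solution: you have to move the parameters out of the URL. The working code is: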
using System.Net;
//...
using (WebClient client = new WebClient())
{
    client.QueryString.Add("p", "1500"); // add parameters
    // The address needs a scheme; without "http://" WebClient treats it as a local path
    string htmlCode = client.DownloadString("http://www.somesite.it");
    //...
}
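If the request still fails, it can help to see which status code the server actually returned. The following is a rough sketch, not part of the original answer, that wraps the same WebClient call in a try/catch and inspects the WebException; the Program class and the output messages are made up for illustration.

```csharp
using System;
using System.Net;

class Program
{
    static void Main()
    {
        using (WebClient client = new WebClient())
        {
            client.QueryString.Add("p", "1500"); // same parameter as in the answer above
            try
            {
                string htmlCode = client.DownloadString("http://www.somesite.it");
                Console.WriteLine(htmlCode.Length);
            }
            catch (WebException ex)
            {
                // ex.Response is null when no HTTP response was received
                // (for example, a DNS or connection failure).
                HttpWebResponse response = ex.Response as HttpWebResponse;
                if (response != null)
                    Console.WriteLine("HTTP status: " + (int)response.StatusCode);
                else
                    Console.WriteLine("Request failed: " + ex.Status);
            }
        }
    }
}
```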