I am fetching an HTML page with the standard HttpWebRequest:
using System;
using System.IO;
using System.Net;

namespace TestConsole.Classes {
    class RequestHeadersOrder {
        public void Test() {
            var req = (HttpWebRequest)WebRequest.Create("https://www.google.com");
            req.Proxy = new WebProxy("localhost", 8888); // for debug in Fiddler proxy (https://www.telerik.com/fiddler)
            req.Host = "www.google.com";
            req.UserAgent = "Robot-tester";
            req.Accept = "*/*";
            req.Headers.Add(HttpRequestHeader.AcceptLanguage, "en-US");
            req.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip, deflate");
            req.AutomaticDecompression = DecompressionMethods.Deflate | DecompressionMethods.GZip;
            string html;
            using (var resp = req.GetResponse())
            using (var respStream = resp.GetResponseStream())
            using (var ms = new MemoryStream()) {
                respStream.CopyTo(ms);
                html = System.Text.Encoding.UTF8.GetString(ms.ToArray());
            }
            Console.WriteLine(html);
        }
    }
}
This produces the following request headers (raw format):
GET https://www.google.com/ HTTP/1.1
User-Agent: Robot-tester
Accept: */*
Accept-Language: en-US
Accept-Encoding: gzip, deflate
Host: www.google.com
Connection: Keep-Alive
However, most browsers send the headers in a different order: the Host header first, then User-Agent, then the rest. So this is what I need:
GET https://www.google.com/ HTTP/1.1
Host: www.google.com
User-Agent: Robot-tester
Accept: */*
Accept-Language: en-US
Accept-Encoding: gzip, deflate
Connection: Keep-Alive
How can I make the Host header come first? This matters for some websites.