HTTP proxy with Socket

Time: 2011-07-30 06:36:17

Tags: c# .net windows sockets proxy

I need to download a web page's content, including the full headers, for example:

HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Content-Type: text/html; charset=utf-8
Expires: Sat, 30 Jul 2011 06:19:13 GMT
P3P: CP="NON UNI COM NAV STA LOC CURa DEVa PSAa PSDa OUR IND"
Date: Sat, 30 Jul 2011 06:20:13 GMT
Transfer-Encoding:  chunked
Connection: keep-alive
Connection: Transfer-Encoding
Set-Cookie: _SS=SID=0B3A2FD5AA7943BC92252BB73BD7C9CA; domain=.bing.com; path=/
Set-Cookie: MUID=CE6F495249204D82A8F620B7317FC59E; expires=Mon, 29-Jul-2013 06:20:13 GMT; domain=.bing.com; path=/
Set-Cookie: OrigMUID=CE6F495249204D82A8F620B7317FC59E%2c95e9e1eafdef40d6a24497335843fac6; expires=Mon, 29-Jul-2013 06:20:13 GMT; domain=.bing.com; path=/
Set-Cookie: OVR=flt=0&flt2=0&flt3=0&flt4=0&flt5=0&flt6=0&flt7=0&flt8=0&flt9=0&flt10=0&flt11=0&ramp1=snrport4-release&release=or3&preallocation=0&R=1; domain=.bing.com; path=/
Set-Cookie: SRCHD=D=1881020&MS=1881020&AF=QBLH; expires=Mon, 29-Jul-2013 06:20:13 GMT; domain=.bing.com; path=/
Set-Cookie: SRCHUID=V=2&GUID=A2EAC1B8990D46619C897016C94B5C4B; expires=Mon, 29-Jul-2013 06:20:13 GMT; path=/
Set-Cookie: SRCHUSR=AUTOREDIR=0&GEOVAR=&DOB=20110730; expires=Mon, 29-Jul-2013 06:20:13 GMT; domain=.bing.com; path=/

000037E4
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"><html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml" xmlns:Web="http://schemas.live.com/Web/"><head><meta content="text/html; charset=utf-8" http-equiv="content-type" /><script type="text/javascript">//<![CDATA[

Since neither WebClient nor HttpWebRequest returns the content together with the full headers, I am using Socket. Here is the code.

using (Socket socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.IP))
{
    IPHostEntry entry = Dns.GetHostEntry(fullUrlAddress);
    socket.ReceiveTimeout = 3000;
    socket.Connect(entry.AddressList[0], 80);

    string request = string.Empty;
    string build_request = string.Empty;
    if (cookieJar.Count != 0)
    {
        request = "GET {0} HTTP/1.1\r\nHost: {1}\r\nUser-Agent: Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nConnection: keep-alive\r\nReferer: {0}\r\nCookie: {2}\r\n\r\n";
        build_request = string.Format(request, requestedUri.AbsoluteUri, requestedUri.Host, GetCookies(requestedUri));
    }
    else
    {
        request = "GET {0} HTTP/1.1\r\nHost: {1}\r\nUser-Agent: Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-us,en;q=0.5\r\nAccept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\nConnection: keep-alive\r\nReferer: {0}\r\nCookie: {2}\r\n\r\n";
        build_request = string.Format(request, requestedUri.AbsoluteUri, requestedUri.Host, "PREF=ID=19495678a6a3dd6e:U=c5ce8e4e3f61da69:FF=0:TM=1311310634:LM=1311310636:S=gbV7hD2dPfycsf8Q; NID=49=dN3QceFFBFxwsCXM43HCRJF_oxoBpUHuUWt2tpoofEDFcRhj7TWWV4EFQNuVYP1GhyBAsQr3oOeohsJp31x8kb_iXiGcQFh1a3IFsPTNKjzJv_NgSK8ssG956PJO7jH-");
    }

    byte[] data = Encoding.UTF8.GetBytes(build_request);
    socket.Send(data, data.Length, 0);

    int bytes = 0;
    byte[] bytesReceived = new byte[10240];
    string currentBatch = string.Empty;

    try
    {
        do
        {
            bytes = socket.Receive(bytesReceived);
            currentBatch = Encoding.ASCII.GetString(bytesReceived, 0, bytes);
            responseString.Append(currentBatch);
        }
        while (bytes > 0);
    }
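    catch (Exception)
    {
        // With keep-alive the server leaves the connection open, so the loop
        // usually ends here when ReceiveTimeout expires (SocketException).
    }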

    socket.Close();
}

It works fine, but I don't know how to connect through an HTTP proxy. Unlike WebClient, Socket has no way to set a proxy with a UserName and Password.

My question is simple: how do I connect through an HTTP proxy with credentials using Socket?

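If you have a solution, please reply. If you were going to recommend WebClient or something else, please don't; I have good reasons for using sockets. Suggestions of open-source libraries, links, and tutorials are welcome.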

2 answers:

Answer 0: (score: 2)

The proxy username and password are sent in an HTTP header. Use the Proxy-Authorization field in the request headers:

Proxy-Authorization: Basic <BASE64("USER:PASS")>
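
As a minimal sketch of the header above in C#, the following builds the Basic credential and a GET request intended for the proxy, then sends it over a plain Socket. The class and method names (ProxyRequestSketch, BasicCredentials, BuildGet, SendViaProxy) are hypothetical and not part of any library. It assumes the proxy accepts Basic authentication, that the target is a plain http:// URL, and it sends "Connection: close" so the read loop can end when the proxy closes the connection.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

// Hypothetical helper: names are illustrative only.
public static class ProxyRequestSketch
{
    // Produces "Basic <base64(user:password)>" for the Proxy-Authorization header.
    public static string BasicCredentials(string user, string password)
    {
        byte[] raw = Encoding.UTF8.GetBytes(user + ":" + password);
        return "Basic " + Convert.ToBase64String(raw);
    }

    // Builds a GET request to be sent to the proxy. The request line carries
    // the absolute URI so the proxy knows which server to forward it to.
    public static string BuildGet(Uri target, string user, string password)
    {
        string host = target.IsDefaultPort ? target.Host : target.Host + ":" + target.Port;
        var sb = new StringBuilder();
        sb.Append("GET ").Append(target.AbsoluteUri).Append(" HTTP/1.1\r\n");
        sb.Append("Host: ").Append(host).Append("\r\n");
        sb.Append("Proxy-Authorization: ").Append(BasicCredentials(user, password)).Append("\r\n");
        sb.Append("Connection: close\r\n");
        sb.Append("\r\n");
        return sb.ToString();
    }

    // Connects to the proxy (not the origin server), sends the request and
    // returns the raw response bytes, status line and headers included.
    public static byte[] SendViaProxy(string proxyHost, int proxyPort, string request)
    {
        using (var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        using (var response = new MemoryStream())
        {
            socket.ReceiveTimeout = 3000;
            socket.Connect(proxyHost, proxyPort);
            socket.Send(Encoding.ASCII.GetBytes(request));

            var buffer = new byte[10240];
            int read;
            // Receive returns 0 once the other side has closed the connection.
            while ((read = socket.Receive(buffer)) > 0)
            {
                response.Write(buffer, 0, read);
            }
            return response.ToArray();
        }
    }
}
```

A call might look like ProxyRequestSketch.SendViaProxy("proxy.example.com", 8080, ProxyRequestSketch.BuildGet(new Uri("http://www.bing.com/"), "user", "pass")), where the proxy host, port, and credentials are placeholders. Returning raw bytes rather than a string leaves the choice of text encoding to the caller.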

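If any of your requests gets the response "407 Proxy Authentication Required", you can read the Proxy-Authenticate response header field, which tells you which authentication scheme to use. The one above is Basic (the most common), but there are others such as Digest and NTLM. You can read about the other two here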
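
The 407 check described here could be sketched roughly as follows. ProxyChallengeParser and its methods are hypothetical names, and the input is assumed to be the response head (status line plus headers) already decoded to a string. For simplicity the sketch takes only the first scheme token on each Proxy-Authenticate line, so a header carrying several comma-separated challenges would need more careful parsing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: names are illustrative only.
public static class ProxyChallengeParser
{
    // True when the status line reads like "HTTP/1.1 407 Proxy Authentication Required".
    public static bool IsProxyAuthRequired(string responseHead)
    {
        int eol = responseHead.IndexOf("\r\n", StringComparison.Ordinal);
        string statusLine = eol >= 0 ? responseHead.Substring(0, eol) : responseHead;
        string[] parts = statusLine.Split(' ');
        return parts.Length >= 2 && parts[1] == "407";
    }

    // Returns the scheme named on each Proxy-Authenticate header line,
    // for example "Basic", "Digest" or "NTLM".
    public static List<string> OfferedSchemes(string responseHead)
    {
        const string name = "Proxy-Authenticate:";
        var schemes = new List<string>();
        string[] lines = responseHead.Split(new[] { "\r\n" }, StringSplitOptions.None);
        foreach (string line in lines)
        {
            if (line.Length == 0) break; // blank line marks the end of the headers
            if (!line.StartsWith(name, StringComparison.OrdinalIgnoreCase)) continue;

            string value = line.Substring(name.Length).Trim();
            int space = value.IndexOf(' ');
            schemes.Add(space >= 0 ? value.Substring(0, space) : value);
        }
        return schemes;
    }
}
```

If "Basic" is among the returned schemes, the request can be resent with the Proxy-Authorization header; the other schemes involve a challenge-response exchange that this sketch does not attempt.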

Answer 1: (score: 1)

Your example shows HTTP response headers, not HTTP request headers. What HTTP request headers do you need to send?

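Doing this directly on a socket is going to be very, very hard unless you make a number of simplifying assumptions (for example, that the server never uses chunked encoding or compression). For instance, your current code will not work correctly if the server uses keep-alive connections. You would be much better off using HttpWebRequest and using Reflection to adjust whatever internal members you need.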
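
To give a sense of the extra work involved, here is a rough sketch of decoding a chunked body such as the one in the question, where 000037E4 is a hexadecimal chunk size. ChunkedDecoder is a hypothetical name, and the sketch assumes the whole body (everything after the blank line that ends the headers) is already in memory. It ignores trailer headers and does not handle compression.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: names are illustrative only.
public static class ChunkedDecoder
{
    // Decodes a "Transfer-Encoding: chunked" body into the plain payload bytes.
    public static byte[] Decode(byte[] body)
    {
        var output = new MemoryStream();
        int pos = 0;
        while (true)
        {
            int lineEnd = IndexOfCrlf(body, pos);
            if (lineEnd < 0) throw new InvalidDataException("Missing chunk-size line.");

            // The size line is hexadecimal and may carry ";extension" data.
            string sizeLine = Encoding.ASCII.GetString(body, pos, lineEnd - pos);
            int semicolon = sizeLine.IndexOf(';');
            if (semicolon >= 0) sizeLine = sizeLine.Substring(0, semicolon);
            int size = Convert.ToInt32(sizeLine.Trim(), 16);

            pos = lineEnd + 2;      // move past the CRLF after the size line
            if (size == 0) break;   // a zero-size chunk ends the body

            if (pos + size > body.Length) throw new InvalidDataException("Truncated chunk.");
            output.Write(body, pos, size);
            pos += size + 2;        // skip the chunk data and its trailing CRLF
        }
        return output.ToArray();
    }

    // Finds the next CRLF at or after 'start', or returns -1 if there is none.
    private static int IndexOfCrlf(byte[] data, int start)
    {
        for (int i = start; i + 1 < data.Length; i++)
        {
            if (data[i] == (byte)'\r' && data[i + 1] == (byte)'\n') return i;
        }
        return -1;
    }
}
```

Working on bytes rather than on a decoded string matters here, because chunk sizes count bytes and a multi-byte UTF-8 character can be split across two chunks.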

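Another option is to embed FiddlerCore in your application (www.fiddler2.com/core). FiddlerCore includes a complete HTTP stack, including support for proxies, compression, chunked encoding, and so on.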