HTTP caching proxy server that does not restrict browser functionality

Date: 2017-11-03 19:24:53

Tags: python http caching https proxy

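I am trying to use this code to create an HTTP caching proxy server. When I run the code, it starts, binds to a port and so on. When I connect from a browser, for example by entering localhost:52523/www.google.com, it opens a port on 55555 and works fine. But when I try other websites, particularly plain HTTP ones, such as localhost:52523/www.microcenter.com or just localhost:52523/google.com, the browser shows "localhost didn't send any data. ERR_EMPTY_RESPONSE" and an exception appears in the console, although the cache file is still created on my computer.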

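I would like to understand how to edit the code so that I can visit any website the way I normally would in a browser without a proxy server. It should be able to work with www.microcenter.com.
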
import socket
import sys
import urllib
from urlparse import urlparse
Serv_Sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # socket.socket function creates a socket.
port = Serv_Sock.getsockname()[1]
# Server socket created, bound and starting to listen
Serv_Sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) # socket.socket function creates a socket.
Serv_Sock.bind(('',port))
Serv_Sock.listen(5)
port = Serv_Sock.getsockname()[1]
# Prepare a server socket
print ("starting server on port %s...,"%(port)) 



def caching_object(splitMessage, Cli_Sock):
    #this method is responsible for caching
    Req_Type = splitMessage[0]
    Req_path = splitMessage[1]
    Req_path = Req_path[1:]
    print "Request is ", Req_Type, " to URL : ", Req_path

    #Searching available cache if file exists
    url = urlparse(Req_path)
    file_to_use = "/" + Req_path
    print file_to_use
    try:
        file = open(file_to_use[5:], "r")
        data = file.readlines()
        print "File Present in Cache\n"

        #Proxy Server Will Send A Response Message
        #Cli_Sock.send("HTTP/1.0 200 OK\r\n")
        #Cli_Sock.send("Content-Type:text/html")
        #Cli_Sock.send("\r\n")

        #Proxy Server Will Send Data
        for i in range(0, len(data)):
            print (data[i])
            Cli_Sock.send(data[i])
        print "Reading file from cache\n"

    except IOError:
        print "File Doesn't Exists In Cache\n fetching file from server \n creating cache"
        serv_proxy = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        host_name = Req_path
        print "HOST NAME:", host_name
        try:
            serv_proxy.connect((url.host_name, 80))
            print 'Socket connected to port 80 of the host'
            fileobj = serv_proxy.makefile('r', 0)
            fileobj.write("GET " + "http://" + Req_path + " HTTP/1.0\n\n")

            # Read the response into buffer
            buffer = fileobj.readlines()

            # Create a new file in the cache for the requested file.
            # Also send the response in the buffer to client socket
            # and the corresponding file in the cache
            tmpFile = open(file_to_use, "wb")
            for data in buffer:
                        tmpFile.write(data)
                        tcpCliSock.send(data)
        except:
            print 'Illegal Request'

    Cli_Sock.close()
while True:
    # Start receiving data from the client
    print 'Initiating server... \n Accepting connection\n'
    Cli_Sock, addr = Serv_Sock.accept() # Accept a connection from client
    #print addr

    print ' connection received from: ', addr
    message = Cli_Sock.recv(1024) # Receives data from the socket

    splitMessage = message.split()
    if len(splitMessage) <= 1:
        continue

    caching_object(splitMessage, Cli_Sock)

1 Answer:

Answer 0: (score: 0)

There are a few errors in the code:

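The first is that the GET request line should not include the protocol, nor should it include the host. The GET should be limited to the path plus the query string.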

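An additional Host header should be added that specifies the host you are contacting (i.e. www.google.com). Some web servers may be set up to tolerate a missing Host header and send you a default page instead, but the results will be intermittent.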
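
As a rough illustration of the two points above, here is a minimal sketch of how the request sent to the origin server could be assembled. It is written for Python 3 (the question's code is Python 2), and the function name `build_origin_request` is hypothetical, not something from the original code. It assumes the proxy has already stripped the leading "/" from the browser's path, as the question's `Req_path` does.

```python
from urllib.parse import urlsplit


def build_origin_request(target):
    """Return (host, port, request_bytes) for a plain HTTP GET.

    `target` is the text taken from the browser URL after the proxy
    address, e.g. "www.google.com/search?q=test". The scheme is optional.
    This helper is illustrative and not part of the original code.
    """
    if "://" not in target:
        # urlsplit only recognises the host when a scheme is present
        target = "http://" + target
    parts = urlsplit(target)
    host = parts.hostname
    if not host:
        raise ValueError("no host in %r" % target)

    # The request line carries only the path and query string
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    port = parts.port or 80
    # The host goes into its own header instead of the request line
    host_header = host if port == 80 else "%s:%d" % (host, port)

    request = (
        "GET %s HTTP/1.0\r\n"
        "Host: %s\r\n"
        "Connection: close\r\n"
        "\r\n" % (path, host_header)
    )
    return host, port, request.encode("ascii")
```

The returned host and port would be passed to `socket.connect`, and the bytes to `sendall`. Note the `\r\n` line endings and the blank line that terminates the headers.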

You should take a look at the HTTP RFC, which describes the other headers that can be passed over HTTP.

You could also install something like Fiddler or Wireshark, monitor a few sample HTTP calls, and see what the payload looks like.
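
If installing a capture tool is not convenient, a rough alternative is to inspect the bytes the proxy itself receives from the browser. The sketch below (Python 3, with a hypothetical helper name `summarize_request`) splits a raw request into its request line and headers. It assumes the whole header block arrived in a single `recv` call, which is not guaranteed in practice.

```python
def summarize_request(raw):
    """Split raw HTTP request bytes into (request_line, headers).

    Illustrative helper, not part of the original code. Header names
    are lower-cased so lookups are case-insensitive.
    """
    # Everything before the first blank line is the header block
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    request_line = lines[0]

    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip malformed lines that have no colon
            headers[name.strip().lower()] = value.strip()
    return request_line, headers


if __name__ == "__main__":
    sample = (b"GET /www.google.com HTTP/1.1\r\n"
              b"Host: localhost:52523\r\n"
              b"Accept: */*\r\n\r\n")
    line, headers = summarize_request(sample)
    print(line)
    for name, value in headers.items():
        print("%s: %s" % (name, value))
```

Calling this on the `message` received in the accept loop shows the request line and headers the browser actually sent to the proxy.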