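I have a homework assignment that involves implementing a caching proxy server in Python. My idea is to write the web pages I visit to temporary files on my local machine and then serve them from those files when they are requested again. Right now the code looks like this: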
    from socket import *
    import sys

    def main():
        #Create a server socket, bind it to a port and start listening
        tcpSerSock = socket(AF_INET, SOCK_STREAM) #Initializing socket
        tcpSerSock.bind(("", 8030)) #Binding socket to port
        tcpSerSock.listen(5) #Listening for page requests
        while True:
            #Start receiving data from the client
            print 'Ready to serve...'
            tcpCliSock, addr = tcpSerSock.accept()
            print 'Received a connection from:', addr
            message = tcpCliSock.recv(1024)
            print message
            #Extract the filename from the given message
            print message.split()[1]
            filename = message.split()[1].partition("/")[2]
            print filename
            fileExist = "false"
            filetouse = "/" + filename
            print filetouse
            try: #Check whether the file exists in the cache
                f = open(filetouse[1:], "r")
                outputdata = f.readlines()
                fileExist = "true"
                #ProxyServer finds a cache hit and generates a response message
                tcpCliSock.send("HTTP/1.0 200 OK\r\n")
                tcpCliSock.send("Content-Type:text/html\r\n")
                for data in outputdata:
                    tcpCliSock.send(data)
                print 'Read from cache'
            except IOError: #Error handling for file not found in cache
                if fileExist == "false":
                    c = socket(AF_INET, SOCK_STREAM) #Create a socket on the proxyserver
                    hostn = filename.replace("www.","",1)
                    print hostn
                    try:
                        c.connect((hostn, 80)) #https://docs.python.org/2/library/socket.html
                        # Create a temporary file on this socket and ask port 80 for
                        # the file requested by the client
                        fileobj = c.makefile('r', 0)
                        fileobj.write("GET " + "http://" + filename + "HTTP/1.0\r\n")
                        # Read the response into buffer
                        buffr = fileobj.readlines()
                        # Create a new file in the cache for the requested file.
                        # Also send the response in the buffer to client socket and the
                        # corresponding file in the cache
                        tmpFile = open(filename,"wb")
                        for data in buffr:
                            tmpFile.write(data)
                            tcpCliSock.send(data)
                    except:
                        print "Illegal request"
                else: #File not found
                    print "404: File Not Found"
            tcpCliSock.close() #Close the client and the server sockets
    main()
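To test my code, I run the proxy cache on localhost and point my browser's proxy settings at it.
However, when I run this code and try to open Google in Chrome, I get an error page that says err_empty_response.
Stepping through the code with a debugger showed me that it fails on this line:
    c.connect((hostn, 80))
I don't understand why. Any help would be greatly appreciated.
P.S. I am testing with Google Chrome, Python 2.7 and Windows 10.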
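Answer 0 (score: 0)
You cannot pass an arbitrary string to connect. It needs an IP address, or a bare host name that resolves to one.
You can use getaddrinfo() to obtain the socket information needed to build the connection. In my pure-python-whois package, I use the following code to create the connection: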
    def _openconn(self, server, timeout, port=None):
        port = port if port else 'nicname'
        try:
            for srv in socket.getaddrinfo(server, port, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_ADDRCONFIG):
                af, socktype, proto, _, sa = srv
                try:
                    c = socket.socket(af, socktype, proto)
                except socket.error:
                    c = None
                    continue
                try:
                    if self.source_addr:
                        c.bind(self.source_addr)
                    c.settimeout(timeout)
                    c.connect(sa)
                except socket.error:
                    c.close()
                    c = None
                    continue
                break
        except socket.gaierror:
            return False
        return c
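Note that this is not great code, because the loop is effectively useless instead of trying the different alternatives. You should only break out of the loop once a connection has been established. Still, it should serve as an illustration of how to use getaddrinfo().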
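For comparison, here is a minimal sketch of the same idea with the loop doing real work: it walks every address getaddrinfo() returns and stops only once one of them connects. The function name open_connection and the default timeout are made up for illustration, not taken from the package, and the code only uses calls that should exist in both Python 2.7 and Python 3.

```python
import socket


def open_connection(host, port, timeout=5.0):
    """Try each address getaddrinfo() returns; return a connected socket or None."""
    try:
        candidates = socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM)
    except socket.gaierror:
        # The name could not be resolved at all (for example, it still
        # contains a "/" left over from the URL).
        return None

    for af, socktype, proto, _canonname, sockaddr in candidates:
        try:
            conn = socket.socket(af, socktype, proto)
        except socket.error:
            continue  # this address family is not usable here; try the next
        try:
            conn.settimeout(timeout)
            conn.connect(sockaddr)
        except socket.error:
            conn.close()
            continue  # this candidate failed; try the next one
        return conn  # stop only once a connection is established
    return None
```

The standard library's socket.create_connection((host, port), timeout) performs a similar loop, so in a proxy like yours that may be all you need once the host name is clean.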
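Edit:
You are also not sanitizing the host name correctly. When I try to access http://www.example.com/, I get /www.example.com/, which obviously cannot be resolved. I suggest you use a regular expression to get the file name for the cache.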
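A rough sketch of what that could look like, assuming the browser sends the absolute-form request line that proxies normally receive (for example "GET http://www.example.com/ HTTP/1.1"). The names split_target and cache_name are hypothetical, and the pattern only covers plain http/https URLs with an optional port.

```python
import re

# Matches a request target such as "http://www.example.com:8080/a/b".
# The host part may not contain "/" or ":", so "/www.example.com/" is rejected.
_TARGET_RE = re.compile(
    r"^(?:https?://)?(?P<host>[^/:]+)(?::(?P<port>\d+))?(?P<path>/.*)?$")


def split_target(request_line):
    """Return (host, port, path) from a line like 'GET http://host/ HTTP/1.1'."""
    parts = request_line.split()
    if len(parts) < 2:
        raise ValueError("malformed request line")
    match = _TARGET_RE.match(parts[1])
    if match is None:
        raise ValueError("unsupported request target: %r" % parts[1])
    host = match.group("host")
    port = int(match.group("port") or 80)  # default to the HTTP port
    path = match.group("path") or "/"
    return host, port, path


def cache_name(host, path):
    """Build a flat cache file name by replacing characters unsafe in file names."""
    return re.sub(r"[^A-Za-z0-9._-]", "_", host + path)
```

With this, the host passed to connect is the bare name (www.example.com), and the cache file name no longer contains slashes, which matters on Windows in particular.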