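I'm building a server/client application in Python with the socket module, and for whatever reason my server keeps ending the connection. The strange part is that this works perfectly on Windows but not on Linux. I've looked all over for a possible solution, but none of them worked. Below is a sanitized version of the code that exhibits the bug, though with a higher success rate; usually it never works. Hopefully this is still enough information. Thanks!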
Server:
import logging
import socket
import threading
import time

def getData():
    HOST = "localhost"
    PORT = 5454

    while True:
        s = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
        s.setsockopt( socket.SOL_SOCKET, socket.SO_REUSEADDR, 1 ) #because linux doesn't like reusing addresses by default
        s.bind( ( HOST, PORT ) )
        logging.debug( "Server listens" )
        s.listen( 5 )
        conn, addr = s.accept()
        logging.debug( "Client connects" )
        print "Connected by,", addr

        dataRequest = conn.recv( 1024 )
        logging.debug( "Server received message" )

        time.sleep( .01 ) #usually won't have to sample this fast

        data = """Here is some data that is approximately the length
of the data that I am sending in my real server. It is a string that
doesn't contain any unordinary characters except for maybe a tab."""

        if not timeThread.isAlive(): #lets client know test is over
            data = "\t".join( [ data, "Terminate" ] )
            conn.send( data )
            s.close()
            print "Finished"
            print "Press Ctrl-C to quit"
            break
        else:
            logging.debug( "Server sends data back to client" )
            conn.send( data )

        logging.debug( "Server closes socket" )
        s.close()

def timer( t ):
    start = time.time()
    while ( time.time() - start ) < t:
        time.sleep( .4 )
        #sets flag for another thread not here

def main():
    global timeThread

    logging.basicConfig( filename="test.log", level=logging.DEBUG )

    #time script runs for
    t = 10 #usually much longer (hours)

    timeThread = threading.Thread( target=timer, args=( t, ) )
    dataThread = threading.Thread( target=getData, args=() )
    timeThread.start()
    dataThread.start()

    #just for testing so I can quit threads when sockets break
    while True:
        time.sleep( .1 )

    timeThread.join()
    dataThread.join()

if __name__ == "__main__":
    main()
Client:
import logging
import socket

def getData():
    dataList = []
    termStr = "Terminate"

    data = sendDataRequest()
    while termStr not in data:
        dataList.append( data )
        data = sendDataRequest()
    dataList.append( data[ :-len( termStr )-1 ] )

def sendDataRequest():
    HOST = "localhost"
    PORT = 5454

    s = socket.socket( socket.AF_INET, socket.SOCK_STREAM )

    while True:
        try:
            s.connect( ( HOST, PORT ) )
            break
        except socket.error:
            print "Connecting to server..."

    logging.debug( "Client sending message" )
    s.send( "Hey buddy, I need some data" ) #approximate length

    try:
        logging.debug( "Client starts reading from socket" )
        data = s.recv( 1024 )
        logging.debug( "Client done reading" )
    except socket.error, e:
        logging.debug( "Client throws error: %s", e )

    print data
    logging.debug( "Client closes socket" )
    s.close()

    return data

def main():
    logging.basicConfig( filename="test.log", level=logging.DEBUG )
    getData()

if __name__ == "__main__":
    main()
Edit: added traceback
Traceback (most recent call last):
  File "client.py", line 39, in <module>
    main()
  File "client.py", line 36, in main
    getData()
  File "client.py", line 10, in getData
    data = sendDataRequest()
  File "client.py", line 28, in sendDataRequest
    data = s.recv( 1024 )
socket.error: [Errno 104] Connection reset by peer
Edit: added debugging output
DEBUG:root:Server listens
DEBUG:root:Client sending message
DEBUG:root:Client connects
DEBUG:root:Client starts reading from socket
DEBUG:root:Server received message
DEBUG:root:Server sends data back to client
DEBUG:root:Server closes socket
DEBUG:root:Client done reading
DEBUG:root:Server listens
DEBUG:root:Client sending message
DEBUG:root:Client connects
DEBUG:root:Client starts reading from socket
DEBUG:root:Server received message
DEBUG:root:Server sends data back to client
DEBUG:root:Client done reading
DEBUG:root:Client sending message
DEBUG:root:Client starts reading from socket
DEBUG:root:Server closes socket
DEBUG:root:Client throws error: [Errno 104] Connection reset by peer
DEBUG:root:Server listens
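Tom's theory appears to be correct. I'll try to figure out how to close the connection better.
This isn't solved, but the accepted answer seems to pinpoint the problem.
Edit: I tried using Tom's getData() function, and it looks like the server is still closing the connection too early. It should be repeatable, because I couldn't get it to work on Windows either.
Server output/traceback: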
Connected by, ('127.0.0.1', 51953)
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/lib64/python2.6/threading.py", line 484, in run
    self.__target(*self.__args, **self.__kwargs)
  File "server.py", line 15, in getData
    s.bind( ( HOST, PORT ) )
  File "<string>", line 1, in bind
error: [Errno 22] Invalid argument
Client output/traceback:
Here is some data that is approximately the length
of the data that I am sending in my real server. It is a string that
doesn't contain any unordinary characters except for maybe a tab.
Traceback (most recent call last):
  File "client.py", line 49, in <module>
    main()
  File "client.py", line 46, in main
    getData()
  File "client.py", line 11, in getData
    data = sendDataRequest()
  File "client.py", line 37, in sendDataRequest
    print data
UnboundLocalError: local variable 'data' referenced before assignment
Log:
DEBUG:root:Server listens
DEBUG:root:Client sending message
DEBUG:root:Client connects
DEBUG:root:Client starts reading from socket
DEBUG:root:Server received message
DEBUG:root:Server sends data back to client
DEBUG:root:Server closes connection
DEBUG:root:Client done reading
DEBUG:root:Client closes socket
DEBUG:root:Client sending message
DEBUG:root:Client starts reading from socket
DEBUG:root:Client throws error: [Errno 104] Connection reset by peer
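Update: I used Tom's getData() function but moved s.bind() to before the loop, and got it to work. I honestly have no idea why this works, so it would be cool if somebody could explain why it's safe for the server to close its client socket but not its server socket. Thanks!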
Answer 0 (score: 5)
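Although I can't reproduce this issue (on Windows 7 64-bit, Python 2.7), my best guess is that the following is happening:
The stack trace you added from the client seems to support this theory. Would it be possible to prove, with some extra logging or similar, that this is not the case?
Other notes: if your client doesn't find the termination string in the first data it receives, it opens a new socket to the server. That doesn't look right to me; you should keep reading from the same socket until you have all of the data.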
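To illustrate the "keep reading from the same socket" point, here is a minimal sketch. It is written for Python 3 (unlike the Python 2 code in the question), and the helper name recv_until is my own invention, not something from the question. It assumes the server marks the end of the data with the same "Terminate" string.

```python
import socket

TERMINATOR = b"Terminate"  # same marker the question's server appends

def recv_until(sock, terminator=TERMINATOR, bufsize=1024):
    """Read from ONE connected socket until the terminator shows up or the peer closes.

    recv_until is a hypothetical helper name used only for this sketch.
    """
    received = bytearray()
    while terminator not in received:
        chunk = sock.recv(bufsize)
        if not chunk:  # empty read: the peer closed the connection
            break
        received.extend(chunk)
    return bytes(received)

if __name__ == "__main__":
    # socketpair() returns two already-connected sockets, handy for a local demo
    sender, receiver = socket.socketpair()
    sender.sendall(b"part one\t")
    sender.sendall(b"part two\tTerminate")
    sender.close()
    print(recv_until(receiver))
    receiver.close()
```

The loop never opens a second connection; it just keeps calling recv() on the socket it already has, which is what I would expect the client to do.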
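Edit: incorporating a few more things:
In your sample log output you haven't updated the code, so I can't see where each log line comes from. However, it looks suspiciously like you have two clients running in parallel (perhaps in different processes or threads?), which leads to:
I just noticed one last thing. In the example at https://docs.python.org/2/library/socket.html#example, the server does not close the socket; it closes the connection spawned from the listening socket. It may be that you have 2 clients connected to the same server socket instance, and when you close the server socket you actually disconnect both clients, not just the first. If you are running multiple clients, logging some kind of identity, e.g. DEBUG:root:Client(6) done reading, might help prove this.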
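One way to get an identity into every log line is to put the process id into the log format. A small sketch, assuming Python 3 and that each client runs in its own process; make_client_logger is a hypothetical helper name, and the Client(...) layout simply mirrors the example line above.

```python
import io
import logging
import os

def make_client_logger(stream, name="client"):
    """Return a logger whose lines include the process id of the client.

    make_client_logger is a hypothetical helper name used only for this sketch.
    """
    handler = logging.StreamHandler(stream)
    # %(process)d is a standard LogRecord attribute holding the process id
    handler.setFormatter(logging.Formatter(
        "%(levelname)s:%(name)s:Client(%(process)d) %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    logger.propagate = False  # avoid a duplicate line from the root logger
    return logger

if __name__ == "__main__":
    buf = io.StringIO()
    log = make_client_logger(buf)
    log.debug("done reading")
    print(buf.getvalue(), end="")
```

To write to test.log as in the question, a logging.FileHandler("test.log") could take the place of the StreamHandler. If the clients are threads rather than processes, %(threadName)s would be the attribute to use instead.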
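Could you try the following for the main loop of the server's data thread? It will show whether the issue is related to closing the listening socket rather than the connection socket: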
def getData():
    HOST = "localhost"
    PORT = 5454

    s = socket.socket( socket.AF_INET, socket.SOCK_STREAM )
    # s.setsockopt( socket.SOL_SOCKET, socket.SO_REUSEADDR, 1 ) #because linux doesn't like reusing addresses by default
    s.bind( ( HOST, PORT ) )
    logging.debug( "Server listens" )
    s.listen( 5 )

    while True:
        conn, addr = s.accept()
        logging.debug( "Client connects" )
        print "Connected by,", addr

        dataRequest = conn.recv( 1024 )
        logging.debug( "Server received message" )

        time.sleep( .01 ) #usually won't have to sample this fast

        data = """Here is some data that is approximately the length
of the data that I am sending in my real server. It is a string that
doesn't contain any unordinary characters except for maybe a tab."""

        if not timeThread.isAlive(): #lets client know test is over
            data = "\t".join( [ data, "Terminate" ] )
            conn.send( data )
            conn.close()
            print "Finished"
            print "Press Ctrl-C to quit"
            break
        else:
            logging.debug( "Server sends data back to client" )
            conn.send( data )

        logging.debug( "Server closes connection" )
        conn.close()
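For anyone who wants to try the same idea in isolation, here is a self-contained sketch in Python 3 (the code above is Python 2). The listening socket is created, bound, and put into listen mode once, and only the per-client connection is closed after each reply. The function names serve, request, and demo, the reply text, and the use of port 0 (let the OS pick a free port) are assumptions made purely for the demo.

```python
import socket
import threading

def serve(listener, count):
    """Accept `count` connections on one listening socket, closing only each connection."""
    for _ in range(count):
        conn, _addr = listener.accept()
        with conn:                      # closes this client's connection, not the listener
            conn.recv(1024)             # read the request
            conn.sendall(b"some data")  # send the reply

def request(port):
    """Open a fresh connection, send one request, and return the reply."""
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(b"Hey buddy, I need some data")
        return s.recv(1024)

def demo(count=5):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))     # port 0: demo-only, the OS picks a free port
    listener.listen(5)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve, args=(listener, count))
    worker.start()
    replies = [request(port) for _ in range(count)]
    worker.join()
    listener.close()                    # the listening socket is closed once, at shutdown
    return replies

if __name__ == "__main__":
    print(demo())
```

Because the listener stays open between requests, a client that reconnects immediately is simply queued until the next accept() call instead of racing against a socket that is about to be closed.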
Answer 1 (score: 3)
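I'm out of my depth here, but while researching a possibly related problem (intermittent "Connection reset by peer" errors on Linux, works fine on Windows), I came across http://scie.nti.st/2008/3/14/amazon-s3-and-connection-reset-by-peer/. Our helpful debugger there, Garry Dolley, summarises (in 2008!):
"Linux kernels 2.6.17+ increased the maximum size of the TCP window/buffer, and this started to cause other gear to wig out if it couldn't handle sufficiently large TCP windows. The gear would reset the connection, and we see this as a 'Connection reset by peer' message."
He offers a solution involving /etc/sysctl.conf. I haven't tried it yet, but it might be worth a look?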
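Answer 2 (score: 0)
I had a similar problem where I was getting connection reset by peer on the sending side. It turned out to be because an exception was being thrown somewhere on the receiving side. So when the script ends unexpectedly, the OS will just RST the connection on that socket. This is a fairly old thread, but for anyone who runs into a similar threading issue, my advice is: make sure it works in a single thread before trying to make it complicated.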
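A rough sketch of how I would make such a receiver-side exception visible instead of only seeing a reset on the sender. This is Python 3, and handle and process are hypothetical names for illustration, not anything from the code in the question.

```python
import logging
import socket

def handle(conn, process):
    """Serve one request; log any failure instead of letting it end the receiver silently.

    `process` is a hypothetical callable that turns the request bytes into reply bytes.
    """
    try:
        request = conn.recv(1024)
        conn.sendall(process(request))
        return True
    except Exception:
        # logging.exception records the full traceback at ERROR level
        logging.exception("receiver failed while handling a request")
        return False
    finally:
        conn.close()  # always close the connection, even after an error

if __name__ == "__main__":
    sender, receiver = socket.socketpair()
    sender.sendall(b"ping")
    handle(receiver, lambda req: req.upper())
    print(sender.recv(1024))
    sender.close()
```

With the traceback in the log, it is at least obvious which side failed first, which is hard to tell from the "Connection reset by peer" message alone.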