Receive socket data in one thread, write the data in another thread - python

Date: 2016-07-06 19:14:17

Tags: python multithreading sockets io

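I'm currently writing a Python program that receives data from a TCP/UDP socket and writes it to a file. Right now the program is I/O bound because it writes every datagram to the file as it arrives (the files are very large, so the slowdown is significant). With that in mind, I decided to try receiving data from the socket in one thread and writing it in a different thread. So far I have come up with the following rough draft. At the moment it only writes a single chunk of data (512 bytes) to the file.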

f = open("t1.txt","wb")
def write_to_file(data):
    f.write(data)

def recv_data():
    dataChunk, addr = sock.recvfrom(buf) #THIS IS THE DATA THAT GETS WRITTEN
    try:
        w = threading.Thread(target = write_to_file, args = (dataChunk,))
        threads.append(w)
        w.start()
        while(dataChunk):
            sock.settimeout(4)
            dataChunk,addr = sock.recvfrom(buf)
    except socket.timeout:
        print "Timeout"
        sock.close()
        f.close()

threads = []
r = threading.Thread(target=recv_data)
threads.append(r)
r.start()

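I think I'm doing something wrong; I'm just not sure what the best way to use threading is. My problem right now is that I have to supply an argument when I create the thread, but the value of that argument doesn't change to reflect the new chunks of data that come in. However, if I put the line w=threading.Thread(target=write_to_file, args=(dataChunk,)) inside the while(dataChunk) loop, wouldn't I be creating a new thread on every iteration?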

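Also, for what it's worth, this is just a small proof of concept for using separate receive and write threads. It is not the larger program that will ultimately use this concept.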

1 answer:

Answer 0 (score: 1)

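You need a buffer that the reading thread writes to and the writing thread reads from. A deque from the collections module is perfect, as it allows appending and popping from either end without performance degradation.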
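As a small illustration of why a deque works as a first-in, first-out buffer, the sketch below (assuming Python 3, with made-up chunk values) appends on the left and pops on the right, so chunks come out in the order they went in:

```python
import collections

buffer = collections.deque()

# Producer side: new chunks enter on the left.
for chunk in (b"first", b"second", b"third"):
    buffer.appendleft(chunk)

# Consumer side: chunks leave on the right, oldest first.
drained = []
while buffer:
    drained.append(buffer.pop())

print(drained)  # [b'first', b'second', b'third']
```

Popping from an empty deque raises IndexError, which is how a consumer can tell that it has caught up with the producer.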

So, don't pass dataChunk to your thread; pass the buffer instead.

import collections  # deque serves as the shared buffer
import socket  # needed for socket.timeout
import threading
import time  # to ease polling

def write_to_file(path, buffer, terminate_signal):
    with open(path, 'wb') as out_file:  # close file automatically on exit
        # go on until the end is signaled AND the buffer has been drained
        while not terminate_signal.is_set() or buffer:
            try:
                data = buffer.pop()  # pop from RIGHT end of buffer
            except IndexError:
                time.sleep(0.5)  # buffer is empty: wait for new data
            else:
                out_file.write(data)  # write a chunk

def read_from_socket(sock, buffer, terminate_signal):
    sock.settimeout(4)
    try:
        while True:
            data, _ = sock.recvfrom(buf)  # buf: receive size from the question's program
            buffer.appendleft(data)  # append to LEFT of buffer
    except socket.timeout:
        print("Timeout")
        terminate_signal.set()  # signal writer that we are done
        sock.close()

# sock is the already-created socket from the question's program
buffer = collections.deque()  # buffer for reading/writing
terminate_signal = threading.Event()  # shared signal
threads = [
    threading.Thread(target=read_from_socket, kwargs=dict(
        sock=sock,
        buffer=buffer,
        terminate_signal=terminate_signal
    )),
    threading.Thread(target=write_to_file, kwargs=dict(
        path="t1.txt",
        buffer=buffer,
        terminate_signal=terminate_signal
    ))
]
for t in threads:  # start both threads
    t.start()
for t in threads:  # wait for both threads to finish
    t.join()
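A possible variation on the code above, not part of the original answer: queue.Queue from the standard library offers a blocking get(), so the writer does not need to sleep and poll. The sketch below assumes Python 3. The names receive, write, demo and SENTINEL are hypothetical, and it uses a local stream socket pair and an in-memory file purely so that it runs on its own. With a UDP socket there is no end-of-stream signal, so the loop would end on the timeout instead, and the empty-data check would have to go because an empty datagram is valid.

```python
import io
import queue
import socket
import threading

SENTINEL = None  # hypothetical end-of-stream marker placed on the queue


def receive(sock, chunks, bufsize=512, timeout=4):
    """Put every received chunk on the queue; stop on timeout or closed peer."""
    sock.settimeout(timeout)
    try:
        while True:
            data = sock.recv(bufsize)
            if not data:  # stream peer closed the connection
                break
            chunks.put(data)
    except socket.timeout:
        pass
    finally:
        chunks.put(SENTINEL)  # always wake the writer so it can exit


def write(out_file, chunks):
    """Block on the queue and write chunks until the sentinel arrives."""
    while True:
        data = chunks.get()  # blocks until data is available; no polling
        if data is SENTINEL:
            break
        out_file.write(data)


def demo():
    """Push three chunks through a local socket pair into an in-memory file."""
    sender, receiver = socket.socketpair()
    chunks = queue.Queue()
    out_file = io.BytesIO()
    threads = [
        threading.Thread(target=receive, args=(receiver, chunks)),
        threading.Thread(target=write, args=(out_file, chunks)),
    ]
    for t in threads:
        t.start()
    for part in (b"abc", b"def", b"ghi"):
        sender.sendall(part)
    sender.close()  # receiver sees end of stream and stops
    for t in threads:
        t.join()
    receiver.close()
    return out_file.getvalue()
```

Calling demo() returns the bytes the writer thread collected. In a real program the in-memory file would be replaced by a file opened in 'wb' mode, and the socket pair by the actual socket.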