Why is my multiprocessing code leaking processes?

Asked: 2016-03-07 12:00:52

Tags: python multithreading multiprocessing

I have built an execution flow that spawns several processes, each of which then spawns multiple threads.

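I think the problem might be one of the following:

  • I am not joining the queues correctly. I join once inside the process, to collect the results from the threads, and once outside, to collect from the processes. However, I never join the child processes themselves.

  • I am not using process.daemon = True

  • I am missing a break somewhere, or something similar.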
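A minimal sketch of how the first two suspicions could be tested in isolation, assuming a toy workload (squaring numbers) in place of the HTTP requests. The names `worker`, `run_batch`, and `SENTINEL` are hypothetical and not part of the code in this question. It keeps a reference to every `Process`, sends one sentinel per process, and calls `Process.join()` explicitly, since `JoinableQueue.join()` only waits for `task_done()` calls and not for the processes to exit. As far as I understand, `daemon=True` only changes what happens when the parent exits, so on its own it would probably not stop processes from accumulating across loop iterations.

```python
import multiprocessing as mp

SENTINEL = None  # hypothetical "no more work" marker; None survives pickling


def worker(qin, qout):
    """Square numbers from qin until the sentinel arrives (stand-in for real work)."""
    while True:
        item = qin.get()  # blocking get, no empty() polling
        if item is SENTINEL:
            break
        qout.put(item * item)


def run_batch(items, n_procs=2, start_method=None):
    """Run one batch and return (sorted results, exit codes of the workers)."""
    items = list(items)
    ctx = mp.get_context(start_method)  # None selects the platform default
    qin, qout = ctx.Queue(), ctx.Queue()
    # Keep a handle on every process so it can be joined later.
    procs = [
        ctx.Process(target=worker, args=(qin, qout), daemon=True)
        for _ in range(n_procs)
    ]
    for p in procs:
        p.start()
    for item in items:
        qin.put(item)
    for _ in procs:
        qin.put(SENTINEL)  # one sentinel per worker process
    # Collect exactly one result per item before joining, so that no
    # child is left waiting to flush its output queue.
    results = [qout.get() for _ in items]
    for p in procs:
        p.join()  # wait for the process itself to exit
    return sorted(results), [p.exitcode for p in procs]
```

Called as `run_batch(range(5))` under an `if __name__ == '__main__':` guard, an exit code of 0 for every worker would indicate that each process shut down cleanly within that iteration.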

However, I have noticed that as the loop runs, some processes stay alive and never close, and I cannot work out why.
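A small diagnostic sketch that might help narrow this down, assuming it is called from the parent at the end of each loop iteration. The helper name `report_lingering` and the `timeout` parameter are hypothetical. It relies on `multiprocessing.active_children()`, which lists the child processes that are still alive.

```python
import multiprocessing as mp


def report_lingering(timeout=1.0):
    """Return (name, pid) for each child still alive after a bounded join."""
    lingering = []
    for child in mp.active_children():
        child.join(timeout)  # give the child up to `timeout` seconds to exit
        if child.is_alive():
            lingering.append((child.name, child.pid))
    return lingering
```

Printing the returned list after every iteration would show whether the same processes survive from one iteration to the next or whether new ones keep piling up.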

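First, I create two queues: one to hold my request URLs and one for the responses to them. Then I start my processes. Each process creates a urllib3.HTTPConnectionPool and then starts its threads, which share that thread-safe connection pool.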

The problem is that some of the processes do not shut down completely, and I cannot figure out why.

import json
from threading import Thread
from urllib3 import HTTPConnectionPool
from multiprocessing import Process, cpu_count, Queue, JoinableQueue, Event


class Consumer(Thread):
    def __init__(self, qin, qout, conn_pool):
        Thread.__init__(self)
        self.__qin = qin
        self.__qout = qout
        self.__connpool = conn_pool

    def run(self):
        # Close once queue empty (otherwise process will linger)
        while not self.__qin.empty():
            msg = self.__qin.get()
            ul, qid = msg
            try:
                response = self.__connpool.request('GET', ul)
                s = float(response.status)
                if s == 200:
                    json_geocode = json.loads(response.data.decode('utf-8'))
                    tot_time_s = json_geocode['paths'][0]['time']
                    tot_dist_m = json_geocode['paths'][0]['distance']
                    out = [qid, s, tot_time_s, tot_dist_m]
                elif s == 400:
                    #print("Done but no route for row: ", qid)
                    out = [qid, 999, 0, 0]
                else:
                    print("Done but unknown error for: ", s)
                    out = [qid, 999, 0, 0]
            except Exception as err:
                print(err)
                out = [qid, 999, 0, 0]
            #print(out)
            self.__qout.put(out)
            self.__qin.task_done()
        return


class Worker(Process):
    def __init__(self, qin, qout, *args, **kwargs):
        super(Worker, self).__init__(*args, **kwargs)
        self._qin = qin
        self._qout = qout
        self.exit = Event()

    def run(self):
        # Create thread-safe connection pool
        concurrent = 10
        with HTTPConnectionPool(host=ghost, port=gport, maxsize=concurrent) as conn_pool:
            num_threads = concurrent
            # Start threads (concurrent) per process
            for _ in range(num_threads):
                Consumer(self._qin, self._qout, conn_pool).start()
            # Block until all urls in self._qin are processed
            self._qin.join()
        return

if __name__ == '__main__':
    for _ in range(100):
        # Fill queue input
        qin = JoinableQueue()
        for url_q in url_routes:
            qin.put(url_q) 
        # Queue to collect output
        qout = Queue()
        # Start cpu_count number of processes (which will launch threads and sessions)
        for __ in range(cpu_count()):
            Worker(qin, qout).start()
        # Block until all urls in qin are processed
        qin.join()
        # Fill routes
        calc_routes = []
        while not qout.empty():
            calc_routes.append(qout.get())
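One pattern in the code above that could plausibly leave a thread blocked is the `while not self.__qin.empty()` check followed by a blocking `get()`: two threads can both see a non-empty queue, one takes the last item, and the other then waits forever, which would keep its process alive. The following is a thread-only sketch of a sentinel-based alternative, not a confirmed fix. The names `consume`, `run_threads`, `STOP`, and `handle` are hypothetical, and it uses `queue.Queue` for brevity. With a multiprocessing queue the sentinel would need to be a picklable value such as `None`, because object identity does not survive pickling.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel; identity is preserved by queue.Queue


def consume(qin, qout, handle):
    """Process items until STOP arrives; always pair get() with task_done()."""
    while True:
        item = qin.get()  # blocks until an item or a sentinel is available
        try:
            if item is STOP:
                return
            try:
                result = handle(item)
            except Exception as err:  # keep the thread alive on failures
                result = err
            qout.put(result)
        finally:
            qin.task_done()


def run_threads(items, handle, num_threads=4):
    """Run handle() over items on num_threads threads and return the results."""
    qin, qout = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=consume, args=(qin, qout, handle))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for item in items:
        qin.put(item)
    for _ in threads:
        qin.put(STOP)  # one sentinel per thread, queued after all real items
    qin.join()  # every item and every sentinel has been marked done
    for t in threads:
        t.join()  # every thread has actually returned
    # All producers have finished, so qsize() is stable here.
    return [qout.get() for _ in range(qout.qsize())]
```

Because each thread only returns after receiving its own sentinel, no thread ever calls `get()` on a queue that will stay empty, and the result count no longer depends on `empty()`.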

0 answers:

No answers yet