Question

I have the following code:

    final = []
    with futures.ThreadPoolExecutor(max_workers=self.number_threads) as executor:
        _futures = [executor.submit(self.get_attribute, listing,
                                    self.proxies[listings.index(listing) % len(self.proxies)]) for listing
                    in listings]
        for result in futures.as_completed(_futures):
            try:
                listing = result.result()
                final.append(listing)
            except Exception as e:
                print traceback.format_exc()
    return final

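The self.get_attribute function submitted to the executor takes a dictionary and a proxy as input, makes one or two HTTP requests to fetch some data, and returns an edited dictionary. The problem is that the workers/threads hang shortly before all submitted tasks are complete. If I submit 400 dictionaries, it completes ~380 tasks and then hangs. If I submit 600, it completes ~570-580. However, if I submit 25, it completes all of them. I'm not sure where the threshold lies between finishing and not finishing.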
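One way to narrow this down is to bound the wait so that a stuck task shows up as a timeout instead of a silent hang. The sketch below is only an illustration under stated assumptions: `get_attribute` here is a hypothetical stand-in (the real body is not shown in the question), a few calls are made to block on purpose to mimic a request that never returns, and `overall_timeout` is an arbitrary value. It is written for Python 3, and it pairs listings with proxies via `enumerate` rather than `listings.index`, which returns the first match when two listings compare equal.

```python
from concurrent import futures
import threading

# Lets the simulated "stuck" requests finish so this demo can exit cleanly.
release = threading.Event()


def get_attribute(listing, proxy):
    """Hypothetical stand-in for self.get_attribute (real body unknown)."""
    if listing["id"] % 10 == 9:
        release.wait()  # mimics an HTTP call that never returns
    return dict(listing, proxy=proxy)


def run(listings, proxies, max_workers=8, overall_timeout=1.0):
    final, stuck = [], []
    with futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its listing so stuck inputs can be reported.
        pending = {
            executor.submit(get_attribute, listing, proxies[i % len(proxies)]): listing
            for i, listing in enumerate(listings)
        }
        try:
            for fut in futures.as_completed(pending, timeout=overall_timeout):
                final.append(fut.result())
        except futures.TimeoutError:
            # Anything not done by now is either blocked or still waiting for a worker.
            stuck = [listing for fut, listing in pending.items() if not fut.done()]
        finally:
            release.set()  # demo only: unblock the simulated stuck calls
    return final, stuck
```

Calling `run([{"id": i} for i in range(20)], ["p1", "p2", "p3"])` returns the completed results together with the listings whose tasks never finished. If the real tasks turn out to be blocked inside an HTTP call, a per-request timeout in the HTTP client (if it offers one) would be the next thing to check.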

I have also tried using a queue and thread system like this:

    def _get_attribute_thread(self):
        while self.q.not_empty:
            job = self.q.get()
            listing = job['listing']
            proxy = job['proxy']
            self.threaded_results.put(self.get_attribute(listing, proxy))
            self.q.task_done()

    def _get_attributes_threaded_with_proxies(self, listings):

        for listing in listings:
            self.q.put({'listing': listing, 'proxy': self.proxies[listings.index(listing) % len(self.proxies)]})

        for _ in xrange(self.number_threads):
            thread = threading.Thread(target=self._get_attribute_thread)
            thread.daemon = True
            thread.start()

        self.q.join()

        final = []
        while self.threaded_results.not_empty:
            final.append(self.threaded_results.get())

        return final

But the result is the same. What can I do to solve or debug the problem? Thanks in advance.
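Two details of the queue-based version may be worth checking on their own. `Queue.not_empty` is a `threading.Condition` object rather than a boolean, so `while q.not_empty:` is always true and the final drain loop blocks in `get()` once the results queue is empty. Also, if `get_attribute` raises, `task_done()` is never called for that job, so `q.join()` can wait forever. The following is a minimal sketch of the same pattern with those two points addressed, written for Python 3; `get_attribute`, `worker`, and `run` are hypothetical names used only for illustration.

```python
import queue
import threading
import traceback


def get_attribute(listing, proxy):
    """Hypothetical stand-in for self.get_attribute (real body unknown)."""
    return dict(listing, proxy=proxy)


def worker(jobs, results):
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel: no more work for this thread
                return
            results.put(get_attribute(job["listing"], job["proxy"]))
        except Exception:
            traceback.print_exc()  # log and keep the worker alive
        finally:
            jobs.task_done()  # always balance the get(), even on failure


def run(listings, proxies, number_threads=4):
    jobs, results = queue.Queue(), queue.Queue()
    for i, listing in enumerate(listings):
        jobs.put({"listing": listing, "proxy": proxies[i % len(proxies)]})

    threads = [
        threading.Thread(target=worker, args=(jobs, results), daemon=True)
        for _ in range(number_threads)
    ]
    for thread in threads:
        thread.start()
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    jobs.join()

    # Drain without blocking: stop as soon as the queue reports Empty.
    final = []
    while True:
        try:
            final.append(results.get_nowait())
        except queue.Empty:
            break
    return final
```

The sentinel values let each worker exit explicitly instead of relying on daemon threads being killed at interpreter shutdown, and `get_nowait()` makes the drain loop end when the results run out.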

Python, hanging threads

0 answers: