Question

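I am trying to use a Queue in Python that will be accessed from multiple threads. I just want to know whether the approach I am using is correct, whether I am doing anything redundant, and whether there is a better approach I should use.

I am trying to get new requests from a table and schedule them, using some logic, to perform an operation such as running a query.

So from the main thread I spawn a separate thread for the queue.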

if __name__=='__main__':

  request_queue = SetQueue(maxsize=-1)
  worker = Thread(target=request_queue.process_queue)
  worker.setDaemon(True)
  worker.start()


  while True:
    try:
      #Connect to the database get all the new requests to be verified
      db = Database(username_testschema, password_testschema, mother_host_testschema, mother_port_testschema, mother_sid_testschema, 0)
      #Get new requests for verification
      verify_these = db.query("SELECT JOB_ID FROM %s.table WHERE JOB_STATUS='%s' ORDER BY JOB_ID" %
                             (username_testschema, 'INITIATED'))

      #If there are some requests to be verified, put them in the queue.
      if len(verify_these) > 0:
        for row in verify_these:
          print "verifying : %s" % row[0]
          verify_id = row[0]
          request_queue.put(verify_id)
    except Exception as e:
      logger.exception(e)
    finally:
      time.sleep(10)

Now in the SetQueue class I have a process_queue function that processes the top 2 requests added to the queue on each run.

'''
Overriding the Queue class to use set as all_items instead of list to ensure unique items added and processed all the time.
'''

class SetQueue(Queue.Queue):
  def _init(self, maxsize):
    Queue.Queue._init(self, maxsize)
    self.all_items = set()

  def _put(self, item):
    if item not in self.all_items:
      Queue.Queue._put(self, item)
      self.all_items.add(item)

  '''
  The Multi threaded queue for verification process. Take the top two items, verifies them in a separate thread and sleeps for 10 sec.
  This way max two requests per run will be processed.
  '''
  def process_queue(self):
    while True:
      scheduler_obj = Scheduler()

      try:
        if self.qsize() > 0:
          for i in range(2):
            job_id = self.get()
            t = Thread(target=scheduler_obj.verify_func, args=(job_id,))
            t.start()

          for i in range(2):
            t.join(timeout=1)
            self.task_done()

      except Exception as e:
        logger.exception(
          "QUEUE EXCEPTION : Exception occurred while processing requests in the VERIFICATION QUEUE")
      finally:
        time.sleep(10)

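I would like to know whether my understanding is correct and whether there are any issues with it.

So, the main thread runs a while True loop in the main function: it connects to the database, gets new requests, and puts them in the queue. The queue's worker thread (a daemon) keeps getting new requests from the queue and forks non-daemon threads that do the processing. Since the timeout for join is 1, the worker thread keeps taking new requests without getting blocked, and its child threads keep processing in the background. Is that correct?

So if the main process exits, these child threads won't be killed until they finish their work, but the worker daemon thread will exit. (Doubt: if the parent is a daemon and the child is non-daemon, does the child exit when the parent exits?)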

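I also read this: David Beazley, multiprocessing

In the "Using a Pool as a Thread Coprocessor" section, David Beazley tries to solve a similar problem. Should I follow his steps?

1. Create a pool of processes.
2. Open a thread like I am doing for request_queue.
3. In that thread: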

  def process_verification_queue(self):
    while True:
      try:
        if self.qsize() > 0:
          job_id = self.get()
          pool.apply_async(Scheduler.verify_func, args=(job_id,))
      except Exception as e:
        logger.exception("QUEUE EXCEPTION : Exception occurred while processing requests in the VERIFICATION QUEUE")

Use a process from the pool and run verify_func in parallel. Will this give me better performance?

Answer 1

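While it's possible to create a new independent thread for the queue and process that data separately the way you are doing it, I believe it is more common for each independent worker thread to post messages to a queue that it already "knows" about. That queue is then processed from some other thread by pulling messages out of it.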

Design Idea

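The way I envision your application is with three threads: the main thread and two worker threads. One worker thread would get requests from the database and put them in the queue. The other worker thread would process the data from the queue.

The main thread would just wait for the other threads to finish by using the thread's .join() function.

You would protect the queue that the threads have access to and make it thread safe by using a mutex. I have seen this pattern in many other designs in other languages as well.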
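A minimal sketch of that three-thread layout, written for Python 3 (where the module is named queue rather than Queue). The names fetch_new_job_ids, STOP, and run are hypothetical stand-ins, and the producer makes a single pass instead of polling every 10 seconds as in the question. Note that queue.Queue already does its own locking internally, so no extra mutex is needed around put() and get() here.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer that no more work is coming


def fetch_new_job_ids():
    """Hypothetical stand-in for the database query in the question."""
    return [101, 102, 103]


def producer(request_queue):
    # Worker 1: fetch requests and put them on the shared queue.
    for job_id in fetch_new_job_ids():
        request_queue.put(job_id)
    request_queue.put(STOP)


def consumer(request_queue, results):
    # Worker 2: pull requests off the queue and process them.
    while True:
        job_id = request_queue.get()
        if job_id is STOP:
            break
        results.append("verified %s" % job_id)  # placeholder for verify_func


def run():
    request_queue = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=producer, args=(request_queue,)),
        threading.Thread(target=consumer, args=(request_queue, results)),
    ]
    for w in workers:
        w.start()
    # The main thread only waits for both workers to finish.
    for w in workers:
        w.join()
    return results
```

Because the workers are joined rather than marked as daemons, the program exits only after the consumer has drained the queue and seen the sentinel.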

Suggested Reading

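"Effective Python" by Brett Slatkin has a great example of this very issue.

Instead of inheriting from Queue, he creates a wrapper class called MyQueue and adds get() and put(message) functions.

He even provides the source code in his GitHub repo: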

https://github.com/bslatkin/effectivepython/blob/master/example_code/item_39.py
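To give a rough idea of the wrapper approach, here is a sketch in Python 3. It is not the book's code: the _pending set and the return values of put() and get() are assumptions added to mirror the uniqueness goal of SetQueue from the question.

```python
import threading
from collections import deque


class MyQueue:
    """Wraps a deque and a lock instead of subclassing Queue."""

    def __init__(self):
        self._items = deque()
        self._pending = set()          # messages currently waiting in the queue
        self._lock = threading.Lock()

    def put(self, message):
        # Returns False when the message is already waiting, True otherwise.
        with self._lock:
            if message in self._pending:
                return False
            self._pending.add(message)
            self._items.append(message)
            return True

    def get(self):
        # Returns the oldest message, or None when the queue is empty.
        with self._lock:
            if not self._items:
                return None
            message = self._items.popleft()
            self._pending.discard(message)
            return message
```

Unlike the SetQueue in the question, this sketch forgets a message once it has been taken, so the same job id can be queued again later and the set does not grow without bound. Whether that is the desired behavior depends on the application. The get() here does not block, so None cannot be used as a message.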

I am not affiliated with the book or its author, but I highly recommend it, as I learned quite a few things from it :)

Answer 2

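I like this explanation of the advantages of, and the differences between, using threads and processes: "...But there's a silver lining: processes can make progress on multiple threads of execution simultaneously. Since a parent process doesn't share the GIL with its child processes, all processes can execute simultaneously (subject to the constraints of the hardware and OS)...."

He has some great explanations of how to get around the GIL and how to improve performance.

Read more here: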

http://jeffknupp.com/blog/2013/06/30/pythons-hardest-problem-revisited/
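One way to compare the two approaches is to make the executor class a parameter, so the same jobs can be run with threads or with processes. This is only a sketch for Python 3 using concurrent.futures; the verify_func below is a hypothetical CPU-bound stand-in for Scheduler.verify_func, and run_jobs is an assumed helper name.

```python
from concurrent.futures import ThreadPoolExecutor


def verify_func(job_id):
    """Hypothetical CPU-bound stand-in for Scheduler.verify_func."""
    return job_id, sum(i * i for i in range(10000))


def run_jobs(job_ids, executor_cls=ThreadPoolExecutor, max_workers=2):
    # executor_cls may be ThreadPoolExecutor or ProcessPoolExecutor;
    # both expose the same map() interface and preserve input order.
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(verify_func, job_ids))
```

Passing concurrent.futures.ProcessPoolExecutor as executor_cls, from under an if __name__ == '__main__': guard, runs each call in a separate process with its own GIL. Arguments and return values then have to be picklable. Whether that is actually faster depends on the workload: if verify_func mostly waits on the database, threads may already be enough.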

Is this the correct approach to using a multithreaded queue in Python?
