Python: using workers to process multiple items in a queue

Date: 2015-10-29 17:19:24

Tags: python

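I am trying to build a simple multiprocessing application using a queue.

I start 4 processes to handle data from several websites. I want each process to handle a different website, but for some reason the processes run multiple times and never exit.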

from multiprocessing import Process
import Queue
import requests

def readdata(item):
    print item
    r = requests.get(item)
    print 'read data'
    print r.status_code


def worker(queue):
    while True:
        try:
            print 'start process'
            item = queue.get()
            readdata(item)
            queue.task_done()
        except:
            print "the end"
            break

if __name__ == "__main__":
    nthreads = 4
    queue = Queue.Queue()
    # put stuff in the queue here
    moreStuff = ['http://www.google.com','http://www.yahoo.com','http://www.cnn.com']
    for stuff in moreStuff:
        queue.put(stuff)
    procs = [Process(target = worker, args = (queue,)) for i in xrange(nthreads)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

Output:

start process
http://www.google.com
start process
http://www.google.com
start process
http://www.google.com
start process
http://www.google.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
read data
200
start process
http://www.yahoo.com
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
/usr/local/lib/python2.7/dist-packages/requests/packages/urllib3/util/ssl_.py:90: InsecurePlatformWarning: A true SSLContext object is not available. This prevents urllib3 from configuring SSL appropriately and may cause certain SSL connections to fail. For more information, see https://urllib3.readthedocs.org/en/latest/security.html#insecureplatformwarning.
  InsecurePlatformWarning
read data
200
start process
http://www.cnn.com
read data
200
start process
http://www.cnn.com
read data
200
start process
http://www.cnn.com
read data
200
start process
read data
200
start process
http://www.cnn.com
read data
200
start process
read data
200
start process
read data
200
start process

How do I check whether the queue is empty and exit?

1 answer:

Answer 0 (score: 0)

Use queue.empty()
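A note on why every URL shows up four times: Queue.Queue is meant for threads, so with a fork-based start each process most likely works on its own copy of the queue, reads all three items, and then blocks forever in get(). Below is a minimal sketch of one way to let workers exit once there is nothing left, assuming Python 3 syntax and a multiprocessing.Queue that is shared between processes. The names fetch and run are hypothetical, and fetch only stands in for readdata so the sketch runs without network access. It relies on a get() timeout rather than empty(), because the multiprocessing documentation describes empty() as not reliable.

```python
import multiprocessing as mp
import queue  # only needed for the queue.Empty exception


def fetch(url):
    # Hypothetical stand-in for readdata(); does no network access.
    return len(url)


def worker(tasks, results):
    while True:
        try:
            # Wait briefly for work; a timeout is treated as "queue drained".
            url = tasks.get(timeout=1)
        except queue.Empty:
            break
        results.put((url, fetch(url)))


def run(urls, nprocs=4):
    tasks = mp.Queue()    # shared between processes, unlike Queue.Queue
    results = mp.Queue()
    for url in urls:
        tasks.put(url)
    procs = [mp.Process(target=worker, args=(tasks, results))
             for _ in range(nprocs)]
    for p in procs:
        p.start()
    # Read the results before join() so no worker is left waiting to
    # flush its buffered items.
    collected = [results.get(timeout=10) for _ in urls]
    for p in procs:
        p.join()
    return sorted(collected)


if __name__ == "__main__":
    print(run(['http://www.google.com', 'http://www.yahoo.com',
               'http://www.cnn.com']))
```

The timeout approach only fits when all items are queued before the workers start, as in the question. If items keep arriving later, a slow producer could make a worker give up too early.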

Also, as a suggestion, since your queue does not change, I would do this:

while not queue.empty():  # Wait for the queue to finish
    pass

print('Queue finished')

instead of:

for p in procs:
    p.join()

Or, better, use a JoinableQueue instead:

for p in procs:
    p.start()
queue.join()
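For completeness, here is a rough sketch of how the JoinableQueue variant might look end to end, assuming Python 3 syntax. queue.join() only returns once every item has been matched by a task_done() call, and the workers still need a way to stop, so this sketch adds one None sentinel per worker. That sentinel convention is my assumption and is not part of the snippet above. The names handle and run are hypothetical, and handle stands in for readdata so nothing touches the network.

```python
import multiprocessing as mp


def handle(url):
    # Hypothetical stand-in for readdata(); does no network access.
    return url.upper()


def worker(tasks, results):
    while True:
        url = tasks.get()
        try:
            if url is None:  # sentinel: no more work for this worker
                break
            results.put(handle(url))
        finally:
            # Called exactly once per get(), including for the sentinel.
            tasks.task_done()


def run(urls, nprocs=4):
    tasks = mp.JoinableQueue()
    results = mp.Queue()
    procs = [mp.Process(target=worker, args=(tasks, results))
             for _ in range(nprocs)]
    for p in procs:
        p.start()
    for url in urls:
        tasks.put(url)
    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    tasks.join()         # blocks until every item has been marked done
    collected = sorted(results.get() for _ in urls)
    for p in procs:
        p.join()
    return collected


if __name__ == "__main__":
    print(run(['http://www.google.com', 'http://www.yahoo.com',
               'http://www.cnn.com']))
```

Each item is taken by exactly one worker here, so every site should be handled once rather than four times.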