Python multiprocessing: child processes stop before finishing their tasks

Time: 2015-08-11 03:59:53

Tags: python web-crawler python-multiprocessing

I'm using Python's multiprocessing module to test some proxies. The workers run fine at first, but after a few minutes they slow down. When I checked Task Manager, only 2 child processes were left; a little later, all of the processes had stopped, even though they hadn't finished their tasks!

May I ask why? T^T

#coding:utf-8
import urllib2
import re
import cookielib
import time
import urllib
import multiprocessing

h = {
                'Connection' : 'keep-alive' ,
                'Accept' : '*/*' ,
                'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36' ,
                }

with open('ip2.txt', 'r') as f_r:
    ips = f_r.readlines()

q = multiprocessing.Queue()
for ip in ips:
    q.put(ip.replace('\n' , ''))

def worker(q):
    while not q.empty():
        ip = q.get()
        proxy_ip='http://'+ip
        print proxy_ip
        proxy = urllib2.ProxyHandler( { 'http' : proxy_ip } )
        cj = cookielib.CookieJar()  
        cookie_support = urllib2.HTTPCookieProcessor(cj)  
        opener = urllib2.build_opener(cookie_support, urllib2.HTTPHandler)  
        opener.add_handler( proxy )
        urllib2.install_opener(opener)
        try:
            urllib2.urlopen(urllib2.Request('http://www.zhihu.com' ,headers=h),timeout=3)
            print ip+'   OK!!!'
            with open('ip_canuse.txt','a') as f_w:
                f_w.write(ip + '\n')
            break
        except Exception,e:
            print e
            continue

if __name__ == '__main__':         
    ps=[]
    for i in range(10):
        ps.append(multiprocessing.Process(target = worker, args = (q,)))
    for p in ps:
        p.daemon = True
        p.start()
    for p in ps:
        p.join()

    print "end"

1 Answer:

Answer 0 (score: 0):

It turns out I had added an unnecessary 'break' there, which caused the problem: as soon as a worker found one working proxy, the break made it leave its while loop, so the child processes exited one by one long before the queue was drained.
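
For reference, here is a minimal sketch of a corrected worker (assuming the rest of the script, including the header dict h, stays the same). Dropping the break lets each process keep draining the queue instead of exiting after its first working proxy, and get_nowait() avoids the race where q.empty() returns False but another worker takes the last item before q.get() runs:

import Queue      # multiprocessing.Queue raises Queue.Empty when drained
import urllib2
import cookielib

def worker(q):
    while True:
        try:
            # non-blocking get: avoids hanging if another worker
            # empties the queue between q.empty() and q.get()
            ip = q.get_nowait()
        except Queue.Empty:
            return
        proxy = urllib2.ProxyHandler({'http': 'http://' + ip})
        opener = urllib2.build_opener(
            urllib2.HTTPCookieProcessor(cookielib.CookieJar()), proxy)
        try:
            opener.open(urllib2.Request('http://www.zhihu.com', headers=h),
                        timeout=3)
            print ip + '   OK!!!'
            with open('ip_canuse.txt', 'a') as f_w:
                f_w.write(ip + '\n')
            # no break here: keep testing the remaining proxies
        except Exception, e:
            print e

With get_nowait(), each worker returns cleanly once the queue is empty, so the p.join() calls in the main block finish for all ten processes.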