I'm using multiprocessing and ghost.py to scrape some data from the internet, but I get this error:
2015-03-31T23:22:30 QT: QWaitCondition: Destroyed while threads are still waiting
Here is part of my code:
    l.acquire()
    global ghost
    try:
        ghost = Ghost(wait_timeout=60)
        ghost.open(website)  # download page
        ghost.wait_for_selector('#pagenum')  # wait for JS
        html = []
        #print u"\t\t the first page"
        html.append(ghost.content)
        pageSum = findPageSum(ghost.content)
        for i in xrange(pageSum-1):  # crawl all pages
            #print u"\t\tthe"+ str(i+2) +"page"
            ghost.set_field_value('#pagenum', str(i+2))
            ghost.click('#page-go')
            ghost.wait_for_text("<td>"+str(20*(i+1)+1)+"</td>")
            html.append(ghost.content)
        for i in html:
            souped(i)
        print website, "\t\t OK!"
    except:
        pass
    l.release()
The rest of the code:
    global _use_line
    q = Queue.Queue(0)
    for i in xrange(len(websitelist)):
        q.put((websitelist[i]))
    lock = Lock()
    while (not q.empty()):
        if (_use_line > 0):
            for i in range(_use_line):
                dl = q.get()
                _use_line -= 1
                print "_use_line: ", _use_line
                p = Process(target=download, args=(lock,dl))
                p.start()
        else:
            time.sleep(1)
ghost.py uses PyQt and PySide. I think the problem is caused by a mistake with some local variable, but I don't know how to track it down.
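
One way to narrow this down is to stop discarding exceptions in the worker, since the bare `except: pass` above hides whatever actually fails. The following is only a minimal sketch under stated assumptions: `fetch_page` is a hypothetical stand-in for the Ghost calls in the first snippet (it is not part of ghost.py), and the name `download` is taken from the `Process(target=download, ...)` call. It logs the full traceback and releases the lock in a `finally` clause so the lock is freed even when the crawl raises.

```python
import logging
import traceback


def fetch_page(website):
    """Hypothetical stand-in for the Ghost-based crawl (not a ghost.py API)."""
    if not website.startswith("http"):
        raise ValueError("bad URL: %r" % website)
    return "<html>%s</html>" % website


def download(lock, website):
    """Crawl one site and report failures instead of silently ignoring them.

    Returns True on success and False if any exception was raised.
    """
    lock.acquire()
    try:
        fetch_page(website)
        return True
    except Exception:
        # Log the whole traceback so the failing call becomes visible.
        logging.error("download failed for %s\n%s",
                      website, traceback.format_exc())
        return False
    finally:
        # Runs on success and on failure, so the lock is never left held.
        lock.release()
```

When the function is used as a `Process` target, its return value is ignored, so the logged traceback is the part that matters. Whether this reveals the cause of the `QWaitCondition` message is not certain; it only makes hidden Python errors visible.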