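I'm checking my customers' countries so that I know which services I can offer them. The problem is that the threads block: for example, the script checks 15-20 emails and then stalls. I'd like a solution that keeps it running. The code is: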
import requests
import re
from sys import argv
from Queue import Queue
from threading import Thread

e = argv[1]
emails = open(e, 'r').readlines()
emails = map(lambda s: s.strip(), emails)
valid=[]

def base(email):
    xo = requests.get("http://www.paypal.com/xclick/business="+email, headers={"User-Agent":"Mozilla/5.0 (Windows NT 5.0; rv:21.0) Gecko/20100101 Firefox/21.0"}).text
    x = re.search("s.eVar36=\"(.*?)\";", xo)
    try:
        if x.group(1) != "":
            print "%s === %s" % (email,x.group(1))
            w=open(str(x.group(1))+".txt", 'a')
            w.write(email+"\n")
            valid.append(email)
    except:
        pass

def work():
    email=q.get()
    base(email)
    q.task_done()

THREADS = 25
q=Queue()
for i in range(THREADS):
    t=Thread(target=work())
    t.daemon=True
    t.start()
if (len(argv)>0):
    for email in emails:
        q.put(email)
    q.join()
Thanks in advance.
Answer 0 (score: 1)
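Your problem is that you call work() when creating the thread instead of passing the work function itself. Rather than patching your code, consider moving to Python's ThreadPool, which does the heavy lifting for you. Here is an example that implements what you want.
The pool's map calls your worker once for each email in the iterable and returns the workers' results as a list, in the same order as the input (unlike the built-in map, which returns an iterator on Python 3). Your worker returns the email if it is valid, or None otherwise, so all you need to do at the end is filter out the Nones.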
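As a small, network-free illustration of that map-and-filter pattern, here is a minimal sketch. The check function and the sample items are hypothetical stand-ins, not part of the original code; the real worker would make the HTTP request instead. It assumes Python 3, where ThreadPool can be used as a context manager.

```python
from multiprocessing.pool import ThreadPool


def check(item):
    # Hypothetical stand-in for the real worker: items that look valid
    # are returned; anything else falls through and returns None.
    if "@" in item:
        return item


items = ["a@example.com", "not-an-email", "b@example.com"]

with ThreadPool(4) as pool:
    # map blocks until every item is processed and keeps input order.
    results = pool.map(check, items, chunksize=1)

# Drop the None entries left by items that did not pass the check.
valid = [r for r in results if r]
print(results)
print(valid)
```

Because the results come back in input order, the position of each None tells you which input was rejected, should you need that.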
import re
from sys import argv
import multiprocessing.pool

import requests

e = argv[1]
# One email address per line; the with block closes the file afterwards.
with open(e) as f:
    emails = [line.strip() for line in f]

def base(email):
    """Return email if a value is found for it on the page, otherwise None."""
    print("getting email {}".format(email))
    try:
        xo = requests.get("http://www.paypal.com/xclick/business="+email, headers={"User-Agent":"Mozilla/5.0 (Windows NT 5.0; rv:21.0) Gecko/20100101 Firefox/21.0"}).text
    except requests.exceptions.RequestException as err:
        # err avoids reusing the name e, which holds the file path above.
        print(err)
        return None
    x = re.search("s.eVar36=\"(.*?)\";", xo)
    # re.search returns None when the page does not contain the marker.
    if x and x.group(1) != "":
        print("{} === {}".format(email, x.group(1)))
        with open(str(x.group(1))+".txt", 'a') as w:
            w.write(email+"\n")
        return email
    return None

THREADS = 25
pool = multiprocessing.pool.ThreadPool(THREADS)
# chunksize=1 hands emails to the threads one at a time.
valid = [email for email in pool.map(base, emails, chunksize=1) if email]
print(valid)
pool.close()
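If you would rather keep the Queue-based design from the question, the following is a minimal sketch of how it might be repaired, assuming Python 3 (where the module is named queue). Two things change: the thread is given the function work rather than the result of calling work(), and the worker loops so that each thread handles more than one item. The process function and the sample items are hypothetical stand-ins for base(email) and the email file.

```python
from queue import Queue
from threading import Thread


def process(item):
    # Hypothetical stand-in for base(email); the real version
    # would perform the HTTP request.
    return item.upper()


def work(q, results):
    # Loop so that each thread handles many items, not just one.
    while True:
        item = q.get()
        try:
            results.append(process(item))
        finally:
            # Always mark the item done, even if process() raises,
            # otherwise q.join() would wait forever.
            q.task_done()


THREADS = 4
q = Queue()
results = []

for _ in range(THREADS):
    # Pass the function itself (work), not the result of calling it.
    t = Thread(target=work, args=(q, results))
    t.daemon = True
    t.start()

items = ["a@example.com", "b@example.com", "c@example.com"]
for item in items:
    q.put(item)

q.join()  # returns once task_done() has been called for every item
print(sorted(results))
```

The threads are daemons, so they do not keep the program alive once q.join() returns. Results arrive in completion order here, which is why they are sorted before printing; the ThreadPool version avoids that bookkeeping.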