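I have an existing script that works well using threads, but my list of items keeps growing, and I need to limit the number of threads actually in use because I am killing my server. So I would like to add a Pool(100) to this script, but everything I have tried so far has failed with an error. Can anyone help me add a simple pool? I have been looking around and a lot of the pool implementations are quite complex; I would rather keep this as simple as possible. Note that I removed the actual body of "def work(item)" because the script is rather large.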
import time, os, re, threading, subprocess, sys

mylist = open('list.txt', 'r')

class working (threading.Thread):
    def __init__(self, item):
        threading.Thread.__init__(self)
        self.item = item
    def run(self):
        work(self.item)

def work(item):
    <actual work that needs to be threaded>

threads = []
for l in mylist:
    work1 = l.strip()
    thread = working(work1)
    threads.append(thread)
    thread.start()
for t in threads: t.join()
mylist.close()
Error when adding the pool:
Process PoolWorker-10:
Traceback (most recent call last):
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 232, in _bootstrap
    self.run()
  File "/usr/lib64/python2.6/multiprocessing/process.py", line 88, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib64/python2.6/multiprocessing/pool.py", line 71, in worker
    put((job, i, result))
  File "/usr/lib64/python2.6/multiprocessing/queues.py", line 366, in put
    return send(obj)
UnpickleableError: Cannot pickle <type 'thread.lock'> objects
New code, just cleaned up:
import time, os, re, threading, subprocess, sys
from multiprocessing.dummy import Pool as ThreadPool

mylist = open('list.txt', 'r')

class working (threading.Thread):
    def __init__(self, item):
        threading.Thread.__init__(self)
        self.item = item
    def run(self):
        work(self.item)

def work(item):
    <actual work that needs to be threaded>

threads = []
for l in mylist:
    work1 = l.strip()
    pool = ThreadPool(10)
    pool.map(working, work1)
    pool.close()
Answer 0 (score: 1)
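multiprocessing is a high-level package for process-based parallelism. To use processes, you need to be able to send data between them, and the error message is telling you that this is not possible for some of your data (picklable = transferable). However, if you read the module documentation at: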
https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.dummy
you will find something called multiprocessing.dummy. Import it and you get the same interface, but backed by threads instead of processes. That is what you want.
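To make the difference concrete, here is a minimal sketch (written for Python 3, although the same imports exist in the Python 2.6 shown in the traceback). It checks that a thread lock cannot be pickled, which is what a process-based pool would need, and then shows that workers from multiprocessing.dummy run inside the current process and can share that lock directly. The helper names can_pickle and where_am_i are made up for illustration.

```python
import os
import pickle
import threading
from multiprocessing.dummy import Pool as ThreadPool

lock = threading.Lock()

def can_pickle(obj):
    """Return True if obj can be pickled, as a process-based pool requires."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        return False

def where_am_i(_):
    # Runs in a pool worker. The shared lock is usable here because the
    # workers are threads of this process, so nothing has to be pickled.
    with lock:
        return os.getpid(), threading.current_thread().name

lock_is_picklable = can_pickle(lock)

pool = ThreadPool(4)
try:
    reports = pool.map(where_am_i, range(8))
finally:
    pool.close()
    pool.join()

# Every task should report the PID of the main process.
pids = set(pid for pid, _ in reports)
print(lock_is_picklable, pids == set([os.getpid()]))
```

Swapping the import for `from multiprocessing import Pool` would move the workers into separate processes, at which point everything sent to or from them has to be picklable.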
Edit:
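Take some time to read the documentation of the multiprocessing module. What you are doing is submitting the creation of individual thread objects to the pool. What you want is to submit the work to be done and the items it should be done on. A (conceptually) correct solution looks like this: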
from multiprocessing.dummy import Pool as ThreadPool

def work(item):
    item = item.strip()
    <actual work that needs to be threaded>

pool = ThreadPool(10)
results = pool.map(work, mylist)  # blocks until every item has been processed
pool.close() # don't think this is strictly necessary
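You do not submit threads to the pool; you hand work to the threads the pool already contains. It is a higher level of abstraction. Hope this clears things up.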