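I am trying to run a huge loop in parallel. The loop happens to be a method of a particular class, and inside the loop I call another method. It does work, but for some reason there is only one process in the list and the output (see code) is always "Worker 0". Either the processes are not being created, or they are not running in parallel. The structure is as follows: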
main.py
from my_class import MyClass


def main():
    class_object = MyClass()
    class_object.method()


if __name__ == '__main__':
    main()
my_class.py
from multiprocessing import Process


class MyClass(object):
    def __init__(self):
        pass  # do something

    def _method(self, worker_num, n_workers, amount, job, data):
        for i, val in enumerate(job):
            print('Worker %d' % worker_num)
            self.another_method(val, data)

    def another_method(self, val, data):
        pass  # do something to the data

    def method(self):
        # definitions of data, job and job_size go here
        n_workers = 16
        chunk = job_size // n_workers
        resid = job_size - chunk * n_workers
        workers = []
        for worker_num in range(n_workers):
            st = worker_num * chunk
            # the last worker also takes the remainder
            amount = chunk if worker_num != n_workers - 1 else chunk + resid
            worker = Process(target=self._method,
                             args=(worker_num, n_workers, amount, job[st:st + amount], data))
            worker.start()
            workers.append(worker)
        for worker in workers:
            worker.join()
        return data
I have read something about child processes needing the main module to be importable, but I don't know how to do that in my case.
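The chunk arithmetic in method() can be checked on its own, without starting any processes. Below is a minimal sketch that repeats the same formulas; the helper name split_bounds is hypothetical and not part of the code above.

```python
def split_bounds(job_size, n_workers):
    """Return (start, amount) per worker, using the same formulas as method()."""
    chunk = job_size // n_workers
    resid = job_size - chunk * n_workers
    bounds = []
    for worker_num in range(n_workers):
        st = worker_num * chunk
        # the last worker also takes the remainder
        amount = chunk if worker_num != n_workers - 1 else chunk + resid
        bounds.append((st, amount))
    return bounds


if __name__ == '__main__':
    for worker_num, (st, amount) in enumerate(split_bounds(10, 4)):
        print('Worker %d: job[%d:%d]' % (worker_num, st, st + amount))
```

One thing this makes visible: when job_size is smaller than n_workers, chunk is 0, so every worker except the last one receives an empty slice and its loop body never runs.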
Answer 0 (score: 0)
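Question: ... but still only one core is in use. So the question is, can I use multiple cores with Process objects?
Which CPU core is used does not depend on Process or on the Python interpreter; the operating system's scheduler decides where each process runs.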
Related: on-what-cpu-cores-are-my-python-processes-running
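As a rough check of which cores the process is allowed to run on at all, something like the following may help. This is only a sketch: os.sched_getaffinity exists only on some Unix platforms such as Linux, and the helper name allowed_cores is made up for illustration.

```python
import os


def allowed_cores():
    """Return the set of core indices this process may be scheduled on, or None."""
    # os.sched_getaffinity is not available on every platform, so guard the call
    if hasattr(os, 'sched_getaffinity'):
        return os.sched_getaffinity(0)  # 0 means the calling process
    return None


if __name__ == '__main__':
    print('logical cores reported:', os.cpu_count())
    print('cores this process may use:', allowed_cores())
```

If the returned set contains a single core, all child processes inherit that restriction and will share that core regardless of how many are started.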
Extend your def _method(... with the following to see what actually happens.
Note: getpidcore(pid) is distribution dependent and may fail!
import os


def getpidcore(pid):
    # Reads the 14th field from the end of /proc/<pid>/stat, which is expected
    # to be the CPU the process last ran on (Linux only)
    with open('/proc/{}/stat'.format(pid), 'rb') as fh:
        core = int(fh.read().split()[-14])
    return core


class MyClass(object):
    ...

    def _method(self, worker_num, n_workers, amount, job, data):
        for i, val in enumerate(job):
            core = getpidcore(os.getpid())
            print('core:{} pid:{} Worker({})'.format(core, os.getpid(), (worker_num, n_workers, amount, job)))
Output:
core:1 pid:7623 Worker((0, 16, 1, [1]))
core:1 pid:7625 Worker((2, 16, 1, [3]))
core:0 pid:7624 Worker((1, 16, 1, [2]))
core:1 pid:7626 Worker((3, 16, 1, [4]))
core:1 pid:7628 Worker((5, 16, 1, [6]))
core:0 pid:7627 Worker((4, 16, 1, [5]))
Tested with Python 3.4.2 on Linux.
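Since getpidcore(pid) counts fields from the end of the line, it depends on how many fields the kernel writes. A possibly more robust variant counts from the front instead. This is a sketch under the assumption that the "processor" value is field 39 of /proc/<pid>/stat, as described in the proc(5) man page; the names parse_processor_field and current_core are hypothetical.

```python
def parse_processor_field(stat_line):
    """Return field 39 ('processor') from the text of /proc/<pid>/stat."""
    # Field 2 (comm) is wrapped in parentheses and may itself contain spaces,
    # so split only the part after the last ')'.
    _, _, rest = stat_line.rpartition(')')
    fields = rest.split()
    # fields[0] is field 3 (state), so field 39 sits at index 36
    return int(fields[36])


def current_core():
    """Return the core the calling process last ran on, or None if unavailable."""
    try:
        with open('/proc/self/stat') as fh:
            return parse_processor_field(fh.read())
    except (OSError, IndexError, ValueError):
        # no /proc filesystem, or an unexpected line format
        return None


if __name__ == '__main__':
    print('core:', current_core())
```

On systems without /proc, current_core() returns None instead of raising, so it can be left inside _method without breaking the workers.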