Using Parallel Python with classes

Asked: 2013-02-10 19:06:52

Tags: python oop parallel-python

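I was shocked to learn how few tutorials and guides there are on the internet about Parallel Python (PP) and working with classes. My problem is that I want to start several instances of the same class and then retrieve some of their variables (for example, read 5 data files in parallel and then retrieve their data). Here is a simple piece of code that illustrates the problem: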

import pp

class TestClass:
    def __init__(self, i):
        self.i = i

    def doSomething(self):
        print "\nI'm being executed!, i = "+str(self.i)
        self.j = 2*self.i
        print "self.j is supposed to be "+str(self.j)
        return self.i

class parallelClass:
    def __init__(self):
        job_server = pp.Server()
        job_list = []
        self.instances = [] # for storage of the class objects
        for i in xrange(3):
            TC = TestClass(i) # initiate a new instance of the TestClass
            self.instances.append(TC) # store the instance
            job_list.append(job_server.submit(TC.doSomething, (), ())) # add some jobs to the job_list
        results = [job() for job in job_list] # execute order 66...

        print "\nIf all went well there's a nice bunch of objects in here:"
        print self.instances
        print "\nAccessing an object's i works ok, but accessing j does not"
        print "i = "+str(self.instances[2].i)
        print "j = "+str(self.instances[2].j)

if __name__ == '__main__' :
    parallelClass() # initiate the program

I've added comments for your convenience. What am I doing wrong here?

1 Answer:

Answer 0 (score: 1)

You should use callbacks.

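A callback is a function that you pass to the submit call. When the job finishes, that function is called with the job's result as its argument (have a look at the API for more arcane usage).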
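To make the flow concrete, here is a minimal sketch of what "called with the result as its argument" means. The fake_submit helper is hypothetical and is not part of pp; it runs the function immediately in the same process and only imitates the shape of a submit call that takes a callback and returns a callable job handle.

```python
def fake_submit(func, args=(), callback=None):
    """Hypothetical stand-in for a job server's submit(); NOT the pp API."""
    result = func(*args)          # a real job server would run this elsewhere
    if callback is not None:
        callback(result)          # the callback receives the job's return value
    return lambda: result         # calling the job handle returns the result


def double(x):
    return 2 * x


received = []                     # the callback appends results here
job = fake_submit(double, (21,), callback=received.append)

print(job())       # the value returned by the job
print(received)    # the same value, as delivered to the callback
```

Whatever the job returns is the only thing the callback gets, so anything you want back has to be part of the return value.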

In your case:

Set up the callback:

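class TestClass:
    def __init__(self, i):
        self.i = i

    def doSomething(self):
        j = 2 * self.i
        return j  # It's REQUIRED that you return j here.

    def set_j(self, j):
        # Callback: receives the job's result and stores it on this instance
        self.j = j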

Add the callback to the job submission call:

class parallelClass:
    def __init__(self):
        # your code...
        job_list.append(job_server.submit(TC.doSomething, callback=TC.set_j))

And you're done.

I also tweaked the code slightly so that doSomething no longer touches self.j and works only with a local variable j.

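As mentioned in the comments, in pp the only thing that is communicated back is the job's result. The job runs on a copy of your object in a worker process, so attributes set there never reach the instance you kept in self.instances. That is why you have to return this variable: the return value is what gets passed to the callback.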
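The effect can be reproduced without pp. The sketch below is only an illustration under one assumption: copy.deepcopy stands in for the copy of the instance that a worker process would operate on. It uses the question's original doSomething, which sets self.j, and the set_j callback from above.

```python
import copy


class TestClass:
    def __init__(self, i):
        self.i = i

    def doSomething(self):
        self.j = 2 * self.i       # set on whichever object runs the method
        return self.j

    def set_j(self, j):
        self.j = j                # callback: store the result on this instance


original = TestClass(3)

# Assumption: deepcopy imitates the copy a worker process would receive
worker_copy = copy.deepcopy(original)
result = worker_copy.doSomething()

copy_has_j = hasattr(worker_copy, "j")       # the copy was modified
original_had_j = hasattr(original, "j")      # the original was not
print(copy_has_j, original_had_j)

# What the callback does in the parent: store the returned value
original.set_j(result)
print(original.j)
```

The original instance only gains j once the returned value is handed to set_j, which is the job the callback does in the pp version.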