Multiprocessing objects with namedtuple - Pickling Error

Asked: 2014-03-10 15:47:04

Tags: python object multiprocessing pickle namedtuple

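I'm having trouble using namedtuples inside objects that I want to pass to multiprocessing. I get a pickling error. I have tried a few things from other Stack Overflow posts, but I couldn't get any of them to work. Here is the structure of my code: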

package_main, test_module

 import multiprocessing
 from multiprocessing import Process
 import myprogram.package_of_classes.data_object_module
 import ....obj_calculate

 class test(object):
     if __name__ == '__main__':
         my_obj = create_dataobj('myobject', ['f1', 'f2'])
         input = multiprocessing.Queue()
         output = multiprocessing.Queue()
         input.put(my_obj)  # the queue pickles my_obj to send it to the child process
         j = Process(target=obj_calculate, args=(input, output))
         j.start()

package_of_classes, data_object_module

 import collections
 import ....load_flat_file

 def get_ntuple_format(obj):
     # Build a namedtuple class at run time from the object's field names.
     return collections.namedtuple('ntuple_format', obj.fields)

 class Data_obj:
    def __init__(self, name,fields):
        self.name=name
        self.fields=fields
        self.ntuple_form=get_ntuple_format(self)  

    def calculate(self):
        self.file_read('C:/files','division.txt')

    def file_read(self,data_directory,filename):
        output=load_flat_file(data_directory,filename,self.ntuple_form)
        self.data=output

utils_package, utils_module

def create_dataobj(name,fields):
    # Return the new object directly; writing to locals() is not reliable.
    return Data_obj(name,fields)

def obj_calculate(input,output):   
    obj=input.get()
    obj.calculate()
    output.put(obj)

loads_module

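import csv
import os

def load_flat_file(data_directory, filename, ntuple_form):
    # Tab-delimited input with no quoting.
    csv.register_dialect('csvrd', delimiter='\t', quoting=csv.QUOTE_NONE)
    ListofTuples = []
    # 'rb' is what the Python 2 csv module expects; on Python 3 open the
    # file in text mode with newline='' instead.
    with open(os.path.join(data_directory, filename), 'rb') as f:
        reader = csv.reader(f, 'csvrd')
        for line in reader:
            if line:  # skip empty rows
                ListofTuples.append(ntuple_form._make(line))
    return ListofTuples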

The error I get is:

PicklingError: Can't pickle <class '__main__.ntuple_format'>: it's not the same object as __main__.ntuple_format

P.S. I extracted this sample code from a large project, so please ignore minor inconsistencies.

2 answers:

Answer 0 (score: 7)

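You cannot pickle a class that you create dynamically (in this case, a named tuple created via get_ntuple_format). For a class to be picklable, it has to be defined at the top level of an importable module.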
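A minimal sketch of that rule, using only the standard library. pickle stores a class by reference (module name plus class name) and fails if looking that name up does not return the very same class object. The names `Point`, `make_format` and `try_pickle` are hypothetical and only serve the illustration; `make_format` stands in for get_ntuple_format.

```python
import collections
import pickle

# Top-level definition: the variable name matches the typename, so pickle
# can find the class again as <module>.Point.
Point = collections.namedtuple('Point', ['x', 'y'])


def make_format(fields):
    # Hypothetical factory mirroring get_ntuple_format: the class is created
    # at call time and never bound to a module-level name 'ntuple_format'.
    return collections.namedtuple('ntuple_format', fields)


def try_pickle(obj):
    """Return True if obj survives a pickle round trip, else False."""
    try:
        return pickle.loads(pickle.dumps(obj)) == obj
    except (pickle.PicklingError, AttributeError):
        return False


static_ok = try_pickle(Point(1, 2))
dynamic_ok = try_pickle(make_format(['f1', 'f2'])('a', 'b'))
print(static_ok, dynamic_ok)
```

The instance of the top-level class round-trips, while the instance of the dynamically created class is rejected because pickle cannot resolve `ntuple_format` in the module.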

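If you only need to support a few kinds of tuples, consider defining them in advance at the top level of a module and then picking the right one dynamically. If you need a fully dynamic container format, consider using a dict instead.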
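Both suggestions could look roughly like the sketch below. The formats `DivisionRow` and `RegionRow`, the `FORMATS` lookup table and the `row_as_dict` helper are assumptions made up for the example, and this variant of `get_ntuple_format` takes the field list directly rather than the object.

```python
import collections
import pickle

# Hypothetical formats, defined once at module top level so pickle can
# locate each class by name.
DivisionRow = collections.namedtuple('DivisionRow', ['f1', 'f2'])
RegionRow = collections.namedtuple('RegionRow', ['f1', 'f2', 'f3'])

# Lookup table keyed by the field names; the choice is made at run time.
FORMATS = {fmt._fields: fmt for fmt in (DivisionRow, RegionRow)}


def get_ntuple_format(fields):
    """Pick a predefined namedtuple class matching the given field names."""
    return FORMATS[tuple(fields)]


def row_as_dict(fields, values):
    """Fully dynamic alternative: a plain dict needs no class at all."""
    return dict(zip(fields, values))


row = get_ntuple_format(['f1', 'f2'])._make(['a', 'b'])
restored = pickle.loads(pickle.dumps(row))
record = pickle.loads(pickle.dumps(row_as_dict(['x', 'y'], ['1', '2'])))
print(restored, record)
```

The trade-off is that the lookup table only covers field sets known in advance, whereas the dict handles any field list but loses attribute access.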

Answer 1 (score: 3)

I think you can pickle a namedtuple, as well as a class defined in __main__:

>>> import dill as pickle
>>> import collections
>>> 
>>> thing = collections.namedtuple('thing', ['a','b'])
>>> pickle.loads(pickle.dumps(thing))
<class '__main__.thing'>

Here is the same thing, used inside a class method.

>>> class Foo(object):
...   def bar(self, a, b):
...     thing = collections.namedtuple('thing', ['a','b'])     
...     thing.a = a 
...     thing.b = b
...     return thing 
... 
>>> f = Foo()
>>> q = f.bar(1,2)
>>> q.a
1
>>> q.b
2
>>> q._fields
('a', 'b')
>>> 
>>> pickle.loads(pickle.dumps(Foo.bar))
<unbound method Foo.bar>
>>> pickle.loads(pickle.dumps(f.bar))
<bound method Foo.bar of <__main__.Foo object at 0x10dbf5450>>

You just have to use dill instead of pickle.

Get dill here: https://github.com/uqfoundation