Python pickling and multiprocessing

Time: 2014-03-15 02:31:35

Tags: python multiprocessing pickle

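I am trying to use multiprocessing to deal with my memory problems, but I can't get a function to pickle, and I have no idea why. My main code starts with: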
def main():
    print "starting main"
    q = Queue()
    p = Process(target=file_unpacking,args=("hellow world",q))
    p.start()
    p.join()
    if p.is_alive():
        p.terminate()
    print "The results are in"
    Chan1 = q.get()
    Chan2 = q.get()
    Start_Header = q.get()
    Date = q.get()
    Time = q.get()
    return Chan1, Chan2, Start_Header, Date, Time

def file_unpacking(args, q):
    print "starting unpacking"
    fileName1 = "050913-00012"
    unpacker = UnpackingClass()
    for fileNumber in range(0,44):
        fileName = fileName1 + str(fileNumber) + fileName3
        header, data1, data2 = UnpackingClass.unpackFile(path,fileName)

        if header == None:
            logging.warning("curropted file found at " + fileName)
            Data_Sets1.append(temp_1)
            Data_Sets2.append(temp_2)
            Headers.append(temp_3)
            temp_1 = []
            temp_2 = []
            temp_3 = []
            #for i in range(0,10000):
            #    Chan1.append(0)
            #    Chan2.append(0)

        else:
            logging.info(fileName + " is good!")
            temp_3.append(header)
            for i in range(0,10000):
                temp_1.append(data1[i])
                temp_2.append(data2[i])

    Data_Sets1.append(temp_1)
    Data_Sets2.append(temp_2)
    Headers.append(temp_3)
    temp_1 = []
    temp_2 = []
    temp_3 = []

    lengths = []
    for i in range(len(Data_Sets1)):
        lengths.append(len(Data_Sets1[i]))
    index = lengths.index(max(lengths))

    Chan1 = Data_Sets1[index]
    Chan2 = Data_Sets2[index]
    Start_Header = Headers[index]
    Date = Start_Header[index][0]
    Time = Start_Header[index][1]
    print "done unpacking"
    q.put(Chan1)
    q.put(Chan2)
    q.put(Start_Header)
    q.put(Date)
    q.put(Time)

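At the moment the unpacking method lives in a separate Python file, which imports struct and os. It reads a part-text, part-binary file, structures the contents, and then closes the file. This is mostly leg work, so I haven't posted it, but I will if it helps. Here is how it starts: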

class UnpackingClass:
    def __init__(self):
        print "Unpacking Class"
    def unpackFile(path,fileName):
        import struct
        import os
    .......

Then I simply call main() to get the party started, and I get nothing but an endless loop of pickle errors.

Long story short, I don't know how to pickle a function. Everything is defined at the top level of the files, so I am at a loss.

Here is the error message:

Traceback (most recent call last):
 File "<string>", line 1, in <module>
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\multiprocessing\forking.py", line 373, in main
prepare(preparation_data)
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\multiprocessing\forking.py", line 488, in prepare
'__parents_main__', file, path_name, etc
 File "A:\598\TestCode\test1.py", line 142, in <module>
Chan1, Chan2, Start_Header, Date, Time = main()
 File "A:\598\TestCode\test1.py", line 43, in main
p.start()
  File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
  File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\multiprocessing\forking.py", line 271, in __init__
dump(process_obj, to_child, HIGHEST_PROTOCOL)
  File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\multiprocessing\forking.py", line 193, in dump
ForkingPickler(file, protocol).dump(obj)
  File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 224, in dump
self.save(obj)
  File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 331, in save
self.save_reduce(obj=obj, *rv)
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 419, in save_reduce
save(state)
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 649, in save_dict
self._batch_setitems(obj.iteritems())
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 681, in _batch_setitems
save(v)
 File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 286, in save
f(self, obj) # Call unbound method with explicit self
  File "C:\Users\Casey\AppData\Local\Enthought\Canopy\App\appdata\canopy-1.1.0.1371.win-x86_64\lib\pickle.py", line 748, in save_global
(obj, module, name))
pickle.PicklingError: Can't pickle <function file_unpacking at 0x0000000007E1F048>: it's not found as __main__.file_unpacking

2 answers:

Answer 0 (score: 4)

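Pickling a function is a very relevant thing to do if you want to do any parallel computing. Python's pickle and multiprocessing are pretty weak when it comes to parallel computing, so if you don't mind going outside the standard library, I'd suggest dill for serialization and pathos.multiprocessing as a replacement for multiprocessing. dill can serialize almost anything in Python, and pathos.multiprocessing uses dill to provide more robust parallel CPU use. For more information, see: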

What can multiprocessing and dill do together?

or this simple example:

Python 2.7.6 (default, Nov 12 2013, 13:26:39) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> from pathos.multiprocessing import ProcessingPool
>>> 
>>> def squared(x):
...   return x**2
... 
>>> pool = ProcessingPool(4)
>>> pool.map(squared, range(10))
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> res = pool.amap(squared, range(10))
>>> res.get()
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> res = pool.imap(squared, range(10))
>>> list(res)
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
>>> 
>>> def add(x,y):
...   return x+y
... 
>>> pool.map(add, range(10), range(10))
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
>>> res = pool.amap(add, range(10), range(10))
>>> res.get()
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
>>> res = pool.imap(add, range(10), range(10))
>>> list(res)
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Both dill and pathos are available here: https://github.com/uqfoundation

Answer 1 (score: 0)

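Technically, you can pickle a function. However, all that gets saved is a name reference. When you unpickle, you have to set up the environment so that the name reference makes sense to Python. Be sure to read "What can be pickled and unpickled" carefully.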
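
To make the "name reference" point concrete, here is a minimal sketch using only the standard pickle module. The function body is a hypothetical stand-in, not the asker's real unpacking code; only the name file_unpacking is taken from the question.

```python
import pickle


def file_unpacking(args):
    """Hypothetical stand-in for the worker function in the question."""
    return args


# Pickling a module-level function stores its module name and its own name,
# not its code. The function name is visible in the raw payload.
payload = pickle.dumps(file_unpacking, protocol=2)
print(payload)

# Unpickling looks the name up again, so the very same object comes back.
restored = pickle.loads(payload)
print(restored is file_unpacking)

# A lambda has no importable name, so the standard pickler refuses it.
lambda_pickled = True
try:
    pickle.dumps(lambda x: x)
except (pickle.PicklingError, AttributeError) as exc:
    lambda_pickled = False
    print("cannot pickle lambda:", exc)
```

The lookup by name is why the process that unpickles must be able to import the same module and find the same attribute in it.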

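If this doesn't answer your question, you need to give us the exact error message. Also, please explain why you want to pickle a function. Since you can only pickle a name reference and not the function itself, why can't you simply import and call the corresponding code?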
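
One possible reading of the traceback in the question: on Windows, multiprocessing starts the child by re-importing the main script, and the trace shows that import reaching the module-level main() call and then p.start() again. If that is what is happening, guarding the entry point is the usual remedy. The sketch below is an assumption-laden illustration, not the asker's code: the worker body and its data are made up, and it is written for Python 3.

```python
from multiprocessing import Process, Queue


def file_unpacking(label, q):
    """Hypothetical worker: puts a single result tuple on the queue."""
    chan1 = [1, 2, 3]
    chan2 = [4, 5, 6]
    q.put((label, chan1, chan2))


def main():
    q = Queue()
    p = Process(target=file_unpacking, args=("demo", q))
    p.start()
    # Read the result before join(): a child that has put a large object on
    # a Queue may not exit until that data has been consumed.
    result = q.get()
    p.join()
    return result


# On Windows the child process re-imports this module. Without the guard,
# that import would call main() again and try to start yet another process.
if __name__ == "__main__":
    print(main())
```

Sending one tuple instead of five separate put() calls is only a design choice here; it keeps the result together and needs a single get() in the parent.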