Deleting files in Python with multiprocessing

Time: 2018-06-21 15:59:38

Tags: multiprocessing python-3.5

I am using the following code to delete a large number of files in Python:

import os
from multiprocessing import Pool

def deleteFiles(loc):
    def Fn_deleteFiles(inp):
        [fn, loc] = [inp['fn'], inp['loc']]
        os.remove(os.path.join(loc, fn))

    p = Pool(5)
    for path, subdirs, files in os.walk(loc):
        if len(files) > 0:
            inpData = [{'fn':x, 'loc':loc} for x in files]
            p.map(Fn_deleteFiles, inpData)
    p.close()

if __name__ == '__main__':
    loc = r'C:\myDriveWithFilesToDelete'
    deleteFiles(loc)

I get the following error:

  File "C:\Program Files\Python 3.5\lib\multiprocessing\reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'deleteFiles.<locals>.Fn_deleteFiles'

1 Answer:

Answer 0: (score: 1)

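The problem is that you are defining a function inside another function.

Fn_deleteFiles(inp) is defined inside deleteFiles(loc).

This means Fn_deleteFiles(inp) only comes into existence while deleteFiles(loc) is running.

Internally, multiprocessing.pool.Pool() uses the pickle library to transfer the function object from this Python process to the newly spawned worker processes.

However, pickle serializes a function by its name rather than by its code, so if it cannot look the function up by that name at module level, it cannot pickle it.

Here is a demo that produces a similar error.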

import pickle
def foo():
    def bar():
        return "Hello"
    return bar

bar = foo()

if __name__ == '__main__':
    s = pickle.dumps(bar)

This results in the same error:

Traceback (most recent call last):
  File ".../stacktest.py", line 10, in <module>
    s = pickle.dumps(bar)
AttributeError: Can't pickle local object 'foo.<locals>.bar'
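To make the name lookup more concrete, here is a minimal sketch that compares the qualified names of a module-level function and a nested one, and probes whether pickle accepts each. The names top_level, make_local and is_picklable are made up for illustration, and the exact exception type raised for a local function can differ between Python versions, so the helper catches several.

```python
import pickle


def top_level():
    return "Hello"


def make_local():
    # inner only exists as a local of make_local, so its qualified name
    # contains "<locals>" and cannot be resolved from the module.
    def inner():
        return "Hello"
    return inner


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == '__main__':
    print(top_level.__qualname__)     # top_level
    print(make_local().__qualname__)  # make_local.<locals>.inner
    print(is_picklable(top_level))
    print(is_picklable(make_local()))
```

The "<locals>" part of the qualified name is what appears in the error message above.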

So, to fix this error, you can use multiprocessing.pool.ThreadPool instead. Its workers are threads in the same process, so nothing needs to be pickled.

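import os
from multiprocessing.pool import ThreadPool as Pool

def deleteFiles(loc):
    # A nested function is fine here: threads share memory, so it is never pickled.
    def Fn_deleteFiles(inp):
        fn, loc = inp['fn'], inp['loc']
        os.remove(os.path.join(loc, fn))

    p = Pool(5)
    for path, subdirs, files in os.walk(loc):
        if len(files) > 0:
            # Use path (the directory currently being walked), not the top-level
            # loc, so that files in subdirectories resolve correctly.
            inpData = [{'fn': x, 'loc': path} for x in files]
            p.map(Fn_deleteFiles, inpData)
    p.close()
    p.join()

if __name__ == '__main__':
    loc = 'DriveWithFilesToDelete'
    deleteFiles(loc)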
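As a variation on the thread-pool approach, the following sketch collects every full path first and hands os.remove to a single map call. It is only an illustration under assumptions: delete_tree_files and its workers parameter are hypothetical names, and the demo runs against a throwaway temporary directory rather than a real drive.

```python
import os
import tempfile
from multiprocessing.pool import ThreadPool


def delete_tree_files(root, workers=5):
    """Delete every file under root with a thread pool; return how many."""
    # Join each file name with the directory os.walk is visiting,
    # so files in subdirectories get the correct full path.
    paths = [os.path.join(dirpath, name)
             for dirpath, _subdirs, files in os.walk(root)
             for name in files]
    # map() blocks until all deletions finish; leaving the with-block
    # then shuts the pool down.
    with ThreadPool(workers) as pool:
        pool.map(os.remove, paths)
    return len(paths)


if __name__ == '__main__':
    # Build a small scratch tree, then delete its files.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, 'sub'))
        for rel in ('a.txt', 'b.txt', os.path.join('sub', 'c.txt')):
            with open(os.path.join(root, rel), 'w') as fh:
                fh.write('x')
        print(delete_tree_files(root))
```

Deleting files is mostly I/O-bound work, so threads are a reasonable fit here, though the actual speedup depends on the filesystem.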

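Alternatively, you can define Fn_deleteFiles(inp) outside of deleteFiles(loc) to solve this problem.

Warning: for reasons I don't understand, this answer hangs when run inside the IDLE interpreter.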

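import os
from multiprocessing import Pool

# Defined at module level so pickle can find it by name in the worker processes.
def Fn_deleteFiles(inp):
    print("Delete", inp)
    fn, loc = inp['fn'], inp['loc']
    os.remove(os.path.join(loc, fn))

def deleteFiles(loc):
    p = Pool(5)
    for path, subdirs, files in os.walk(loc):
        if len(files) > 0:
            # Use path (the directory currently being walked), not the top-level
            # loc, so that files in subdirectories resolve correctly.
            inpData = [{'fn': x, 'loc': path} for x in files]
            p.map(Fn_deleteFiles, inpData)
    p.close()
    p.join()

if __name__ == '__main__':
    loc = 'DriveWithFilesToDelete'
    deleteFiles(loc)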
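One thing to keep in mind with the module-level version is that an exception raised in a worker is re-raised by map in the parent process. The sketch below is one possible way to report failed deletions instead of stopping on the first one. The names remove_quietly and delete_files are hypothetical and not part of the original answer.

```python
import os
from multiprocessing import Pool


def remove_quietly(path):
    """Delete one file; return (path, None) on success or (path, error text)."""
    try:
        os.remove(path)
        return path, None
    except OSError as exc:
        return path, str(exc)


def delete_files(root, workers=5):
    """Delete all files under root; return a list of (path, error) failures."""
    paths = [os.path.join(dirpath, name)
             for dirpath, _subdirs, files in os.walk(root)
             for name in files]
    if not paths:
        return []  # nothing to do, so no worker processes are started
    # remove_quietly lives at module level, so it can be pickled by name.
    with Pool(workers) as pool:
        results = pool.map(remove_quietly, paths)
    return [(p, err) for p, err in results if err is not None]


if __name__ == '__main__':
    for path, err in delete_files('DriveWithFilesToDelete'):
        print('could not delete', path, err)
```

Returning the error as text keeps the result trivially picklable on its way back to the parent process.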