joblib BrokenProcessPool: arguments of the function are not picklable

Time: 2019-07-04 12:59:09

Tags: python pandas joblib

I am trying to use multiprocessing to filter some rows out of each df in a list of dataframes:

# Accepts and returns a list of dataframes
dataframes = myClass.filter_calibration(dataframes)

In myClass:

from joblib import Parallel, delayed

self.n_cores = -2      # joblib convention: use all CPUs but one
self.backend = 'loky'

def filter_calibration(self, dataframes, verbose=True):
    results = Parallel(n_jobs=self.n_cores, backend=self.backend, verbose=verbose)(
              delayed(helperFunctions.filter_calibration_helper)(df) for df in dataframes)

    return results

In a separate .py file with helper functions (not in a class):

import numpy as np

def filter_calibration_helper(df):
    # Mark every row flagged as calibrating as not valid.
    if True in df['Calibrating'].unique():
        calib = np.array(df['Calibrating'])
        valids = np.array(df['Valid'])
        # Positions where the calibration flag compares equal to True
        calib_indices = np.argwhere(calib == True)
        valids[calib_indices] = False
        df['Valid'] = valids
    else:
        calib = np.array(df['Calibrating'])
        valids = np.array(df['Valid'])
        # Positions where the calibration flag compares equal to 1
        calib_indices = np.argwhere(calib == 1)
        valids[calib_indices] = False
        df['Valid'] = valids

    return df
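
As an aside, the two branches of the helper appear to do the same work, because an elementwise comparison treats `True` and `1` as equal. A minimal sketch of a single-branch equivalent, assuming the `Calibrating` and `Valid` columns shown in the question (the name `filter_calibration_mask` is hypothetical, and unlike the original this version returns a copy instead of modifying its input):

```python
import pandas as pd


def filter_calibration_mask(df):
    """Mark rows as invalid wherever 'Calibrating' compares equal to True (or 1)."""
    out = df.copy()  # work on a copy so the caller's frame is left unchanged
    # Elementwise comparison: both True and 1 match, so one branch is enough.
    mask = out['Calibrating'] == True  # noqa: E712
    out.loc[mask, 'Valid'] = False
    return out


if __name__ == "__main__":
    # Hypothetical sample data with 0/1 calibration flags.
    frame = pd.DataFrame({'Calibrating': [1, 0, 1], 'Valid': [True, True, True]})
    print(filter_calibration_mask(frame)['Valid'].tolist())
```

This does not address the BrokenProcessPool error itself; it only shortens the function that the workers run.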

However, I keep getting this error:


  File "/Users/ima/example_script.py", line 45, in <module>
    dataframes = myClass.filter_calibration(dataframes)

  File "/Users/ima/myClass.py", line 59, in filter_calibration
    delayed(helperFunctions.filter_calibration_helper)(df) for df in zip(dataframes))

  File "/Users/ima/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 934, in __call__
    self.retrieve()

  File "/Users/ima/anaconda3/lib/python3.7/site-packages/joblib/parallel.py", line 833, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "/Users/ima/anaconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py", line 521, in wrap_future_result
    return future.result(timeout=timeout)

  File "/Users/ima/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()

  File "/Users/ima/anaconda3/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception

BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.

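For the life of me, I cannot figure out what is wrong. The only argument is a dataframe, and dataframes work fine with Parallel in several other functions in my code.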
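
One way to test the claim in the error message is to round-trip each argument through `pickle` in the parent process before handing it to `Parallel`. The sketch below is only a diagnostic under stated assumptions: `find_unpicklable` is a hypothetical helper, it uses the standard `pickle` module (which may differ from the serializer joblib's `loky` backend actually uses), and it cannot detect problems that only occur inside a worker process, such as a module the worker fails to import.

```python
import pickle


def find_unpicklable(items):
    """Return (index, error message) pairs for items that fail a pickle round trip."""
    failures = []
    for index, item in enumerate(items):
        try:
            # Serialize and immediately deserialize the item in this process.
            pickle.loads(pickle.dumps(item))
        except Exception as exc:  # pickle can raise several exception types
            failures.append((index, f"{type(exc).__name__}: {exc}"))
    return failures


if __name__ == "__main__":
    # Hypothetical stand-ins for the list of dataframes: two plain
    # containers and one lambda, which the standard pickle module rejects.
    sample = [{"Valid": [True, False]}, [1, 2, 3], lambda x: x]
    for index, message in find_unpicklable(sample):
        print(f"item {index} failed: {message}")
```

Calling `find_unpicklable(dataframes)` just before `filter_calibration` would show whether any individual dataframe is the culprit. An empty result suggests the problem lies elsewhere than in the arguments themselves.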

I also tried joblib's multiprocessing backend, but that just hangs.

Thanks for any help!

0 answers:

No answers yet