Python multiprocessing - PicklingError: Can't pickle

Asked: 2017-08-29 14:46:33

Tags: python multiprocessing pickle

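The following class Downloader is supposed to query a SQL database several times and store the results in a list of pandas.DataFrame objects.

I would like to use multiprocessing to speed up the retrieval, but I get this error: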

    line 53, in run_queries
    dfs_queries = p.map(run_query, queries)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

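I looked at this question, which indicates that pyodbc connection and cursor objects cannot be pickled. Is there a way to use pool.map(f, arglist) in multiprocessing when f depends on a SQL connection?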
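One thing worth noting: the traceback complains about `<type 'function'>`, not about a connection object. Pool.map has to pickle the function it is given, and pickle stores a function only as a module-plus-name reference, so a function defined inside another function (like run_query in the code below) cannot be pickled. The following is only a small sketch to check that behaviour; `top_level`, `make_nested` and `can_pickle` are made-up names for illustration.

```python
import pickle


def top_level(x):
    # Defined at module level: pickle records only its module and name.
    return x + 1


def make_nested():
    def nested(x):
        # Defined inside another function, like run_query inside run_queries.
        return x + 1
    return nested


def can_pickle(obj):
    # True if pickle.dumps succeeds, False on the usual pickling failures.
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    print("top_level picklable: %s" % can_pickle(top_level))
    print("nested picklable: %s" % can_pickle(make_nested()))
```

Run as a script, this should report that the module-level function can be pickled and the nested one cannot, which matches the error above.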

import pyodbc
from multiprocessing import Pool as ThreadPool
import pandas as pd

class Downloader(object):

    def _connect(self, path_db_config):
        # ... Loads a config file from which it gets dsn, user and password ... #

        con_string = 'DSN=%s;UID=%s;PWD=%s;' % (dsn, user, password)
        return pyodbc.connect(con_string)

    def run_queries(self):
        queries = [# List of sql queries #]
        p = ThreadPool(len(queries))

        def run_query(query):
            cnxn = self._connect(PATH_DB_CONFIG)
            df = pd.read_sql(query, cnxn)
            return df

        return p.map(run_query, queries)

Thanks for your help!!

0 answers:

No answers yet