Replacing pickle with dill for pymongo calls

Time: 2016-04-20 22:20:10

Tags: parallel-processing pymongo pickle dill

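I finally understood how to replace pickle with dill, thanks to the example in the following discussion: pickle-dill. For example, the following code works for me: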

import os
import dill
import multiprocessing

def run_dill_encoded(what):
    fun, args = dill.loads(what)
    return fun(*args)

def apply_async(pool, fun, args):
    return pool.apply_async(run_dill_encoded, (dill.dumps((fun, args)),))

if __name__ == '__main__':

    pool = multiprocessing.Pool(5)
    results = [apply_async(pool, lambda x: x*x, args=(x,)) for x in range(1,7)]
    output = [p.get() for p in results]
    print(output)

I tried to apply the same approach to pymongo. The following code

import os
import dill
import multiprocessing
import pymongo

def run_dill_encoded(what):
    fun, args = dill.loads(what)
    return fun(*args)


def apply_async(pool, fun, args):
    return pool.apply_async(run_dill_encoded, (dill.dumps((fun, args)),))


def write_to_db(value_to_insert):           
    client = pymongo.MongoClient('localhost',  27017)
    db = client['somedb']
    collection = db['somecollection']
    result = collection.insert_one({"filed1": value_to_insert})
    client.close()

if __name__ == '__main__':
    pool = multiprocessing.Pool(5)
    results = [apply_async(pool, write_to_db, args=(x,)) for x in ['one', 'two', 'three']]
    output = [p.get() for p in results]
    print(output)

produces this error:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Python34\lib\multiprocessing\pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "C:\...\temp2.py", line 10, in run_dill_encoded
    return fun(*args)
  File "C:\...\temp2.py", line 21, in write_to_db
    client = pymongo.MongoClient('localhost',  27017)
NameError: name 'pymongo' is not defined
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/.../temp2.py", line 32, in <module>
    output = [p.get() for p in results]
  File "C:/.../temp2.py", line 32, in <listcomp>
    output = [p.get() for p in results]
  File "C:\Python34\lib\multiprocessing\pool.py", line 599, in get
    raise self._value
NameError: name 'pymongo' is not defined

Process finished with exit code 1

What is wrong?

1 answer:

Answer 0 (score: 1)

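As I mentioned in the comments, you need to put import pymongo inside the function write_to_db. This is because when the function is serialized, it does not carry any global references with it when it is shipped to the other process space.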
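
The effect can be imitated with the standard library alone, which makes it easy to try without a MongoDB server. The sketch below is only a stand-in, not dill's actual mechanism: it uses marshal and types.FunctionType instead of dill, and math instead of pymongo. The names ship, try_call, area_global_import and area_local_import are hypothetical and exist only for this illustration.

```python
import builtins
import marshal
import math
import types


def area_global_import(radius):
    # Relies on the module-level name `math` being present in globals.
    return math.pi * radius ** 2


def area_local_import(radius):
    # The import runs when the function is called, so the name is
    # resolved in whichever process executes the body.
    import math
    return math.pi * radius ** 2


def ship(fun):
    """Serialize only the code object, then rebuild it with fresh globals.

    Assumption: this imitates a function arriving in a process where the
    sender's module-level names are not available.
    """
    payload = marshal.dumps(fun.__code__)
    code = marshal.loads(payload)
    fresh_globals = {"__builtins__": builtins}
    return types.FunctionType(code, fresh_globals, fun.__name__)


def try_call(fun, *args):
    """Return the result, or a short message if a global name is missing."""
    try:
        return fun(*args)
    except NameError as exc:
        return "NameError: {}".format(exc)


# The first call fails on the missing global; the second one succeeds.
broken = try_call(ship(area_global_import), 1.0)
fixed = try_call(ship(area_local_import), 1.0)
print(broken)
print(fixed)
```

For the code in the question, the equivalent change is to move import pymongo to the first line inside write_to_db, so that the worker process resolves the name itself.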