Generating multiple plots by querying MongoDB with multiprocessing

Time: 2019-05-03 02:24:30

Tags: python-3.x mongodb matplotlib multiprocessing

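I want to speed up a plotting function that looks up its data in MongoDB Atlas. I adapted an example I found online, but I am not sure it is correct. Using multiprocessing.Pool() seems to be slower than not using the package at all. What am I doing wrong? Thanks.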

[Image: Jupyter notebook output]

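from pymongo import MongoClient
from matplotlib.backends.backend_svg import FigureCanvasSVG
from matplotlib.figure import Figure
import io
import multiprocessing
import time

# x-axis values: 220 to 800 in steps of 10
lstOfwavelengths = list(range(220, 810, 10))
# rendered SVGs; a plain list is per-process, so appends made inside
# Pool workers are not visible to the parent process
lstOfPlts = []

def build_graph_mongo_multiproc(pltcodeWithSuffix, wellID):
    # a new client is opened on every call
    client = MongoClient()
    db = client.databasename
    img = io.BytesIO()
    fig = Figure(figsize=(0.6, 0.6))
    axis = fig.add_subplot(1, 1, 1)
    # one lookup per wavelength; assumes one document per wavelength
    # that holds one field per well ID
    absvals = [db[pltcodeWithSuffix].find_one({"Wavelength": w})[wellID]
               for w in lstOfwavelengths]
    axis.plot(lstOfwavelengths, absvals)
    axis.set_title(f'{pltcodeWithSuffix}:{wellID}', fontsize=9)
    axis.title.set_position([.5, .6])
    # hide ticks and tick labels so the thumbnail shows only the curve
    axis.tick_params(
            which='both',
            bottom=False,
            left=False,
            labelbottom=False,
            labelleft=False)
    # render the figure as SVG into the in-memory buffer
    FigureCanvasSVG(fig).print_svg(img)
    lstOfPlts.append(img.getvalue())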

The only difference between the singleproc and multiproc functions is that, for singleproc, MongoClient is called once, outside the function.
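One likely contributor to the slowdown, offered as a guess rather than a measurement: every call opens a new MongoClient, and each connection has to be set up again. The PyMongo documentation also advises that each process create its own client instead of sharing one across a fork. In addition, results appended to a module-level list inside a Pool worker stay in that worker. The sketch below opens one handle per worker through the Pool initializer and returns the values from the task instead. The names init_worker, fetch_well_values, fetch_all and make_db are hypothetical, and the document layout (one field per well ID) is an assumption.

```python
import multiprocessing

_worker_db = None  # one database handle per worker process


def init_worker(make_db):
    """Pool initializer: runs once in each worker after it has started."""
    global _worker_db
    _worker_db = make_db()


def fetch_well_values(task):
    """Return (well_id, values) for one well across a collection's documents."""
    collection_name, well_id = task
    # Project only the requested well so less data crosses the network.
    cursor = _worker_db[collection_name].find({}, {well_id: 1, "_id": 0})
    return well_id, [doc[well_id] for doc in cursor]


def fetch_all(make_db, tasks, processes=4):
    """Run fetch_well_values over tasks, with one database handle per worker."""
    with multiprocessing.Pool(
        processes, initializer=init_worker, initargs=(make_db,)
    ) as pool:
        # map returns results in task order; collect them in the parent.
        return dict(pool.map(fetch_well_values, tasks))
```

Here make_db would be a module-level function returning something like MongoClient(connect_string)[dbname], and tasks a list of (collection name, well ID) pairs. The plotting can then run in the parent or be moved into the task, as long as the SVG bytes are returned and not appended to a global.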

1 Answer:

Answer 0 (score: 0)

I found this great article: The efficient way of using multiprocessing with pymongo

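Using this article as a template, I was able to reduce the computation time from 21 seconds to about 7.5 seconds. I am sure someone more experienced could save even more time, but I think this is good enough for my level.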

manager = multiprocessing.Manager()
lstOfPlots = manager.list()

def chunks(l, n):
    for i in range(0, len(l), n):
        yield l[i:i + n]

def getAllWellVals(db,pltcodeWithSuffix,wellID):
    lstOfVals = []
    for i in db[pltcodeWithSuffix].find({}, {wellID:1,'_id':0}):
        lstOfVals.append(i[wellID])
    return lstOfVals

def build_graph_mongo_multiproc(chunk,pltcodeWithSuffix):
    global lstOfPlots
    # each process opens its own client; connect_string and dbname are defined elsewhere
    client=MongoClient(connect_string,maxPoolSize=10000)
    db = client[dbname]
    # loop over the well IDs in the chunk and plot each one
    for wid in chunk:
        # fetch this well's values and render them as an SVG
        img = io.BytesIO()
        fig = Figure(figsize=(0.6,0.6))
        axis = fig.add_subplot(1,1,1)
        absVals = getAllWellVals(db,pltcodeWithSuffix,wid)
        axis.plot(lstOfwavelengths,absVals)
        axis.set_title(f'{wid}',fontsize=9)
        axis.title.set_position([.5, .6])
        axis.tick_params(
                which='both',
                bottom=False,
                left=False,
                labelbottom=False,
                labelleft=False)
        FigureCanvasSVG(fig).print_svg(img)
        result = img.getvalue()
        lstOfPlots.append(result)
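The answer does not show how the worker gets started. One way to drive it, as a sketch only: split the well IDs into roughly one chunk per process, start a multiprocessing.Process for each chunk, and wait for all of them. chunk_size_for and run_chunked are hypothetical helper names, and chunks repeats the generator from the answer so the block is self-contained.

```python
import multiprocessing


def chunks(items, n):
    """Yield successive slices of items, each at most n long."""
    for i in range(0, len(items), n):
        yield items[i:i + n]


def chunk_size_for(n_items, n_procs):
    """Chunk size that spreads n_items over at most n_procs chunks."""
    return max(1, -(-n_items // n_procs))  # ceiling division, at least 1


def run_chunked(worker, items, n_procs, *extra_args):
    """Start one process per chunk, wait for all, return the process count."""
    size = chunk_size_for(len(items), n_procs)
    procs = [
        multiprocessing.Process(target=worker, args=(chunk, *extra_args))
        for chunk in chunks(items, size)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return len(procs)
```

With the answer's worker this would be called as run_chunked(build_graph_mongo_multiproc, well_ids, 4, pltcodeWithSuffix), where well_ids is a hypothetical list of well IDs, after which lstOfPlots holds the SVGs in completion order, not well order. The worker reaches lstOfPlots as a global, which appears to rely on child processes inheriting it under the fork start method. With spawn, the shared list would probably have to be passed in as an extra argument.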