Querying data concurrently with cx_Oracle and multiprocessing

Asked: 2014-05-09 19:27:51

Tags: python multiprocessing cx-oracle

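All,

I am trying to access and process a large amount of data from an Oracle database, so I am using the multiprocessing module to spawn 50 processes that access the database. To avoid opening 50 physical connections, I tried to use session pooling from cx_Oracle. The code is shown below. However, I always get an unpickling error. I know cx_Oracle has pickling issues, but I thought I had worked around them by using a global variable. Can anyone help?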

import sys
import cx_Oracle
import os
from multiprocessing import Pool

# Read a list of ids from the input file
def ReadList(inputFile):
    ............


def GetText(applId):
    global sPool
    connection = sPool.acquire()
    cur = connection.cursor()
    cur.prepare('Some Query')
    cur.execute(None, appl_id = applId)
    result = cur.fetchone()
    title = result[0]
    abstract = result[2].read()
    sa = result[3].read()
    cur.close()
    sPool.release(connection)
    return (title, abstract, sa)


if __name__ == '__main__':
    inputFile = sys.argv[1]
    ids = ReadList(inputFile)
    dsn = cx_Oracle.makedsn('xxx', ...)
    sPool = cx_Oracle.SessionPool(....., min=1, max=10, increment=1)
    pool = Pool(10)
    results = pool.map(GetText, ids)


Exception in thread Thread-2:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 525, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.6/threading.py", line 477, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.6/multiprocessing/pool.py", line 282, in _handle_results
    task = get()
UnpicklingError: NEWOBJ class argument has NULL tp_new

2 Answers:

Answer 0: (score: 0)

First, your code fails with "NameError: global name 'sPool' is not defined", so sPool=cx_Oracle.SessionPool(....., min=1, max=10, increment=1) must be placed above def GetText(applId):

For me, this code started working after I changed from multiprocessing import Pool to from multiprocessing.dummy import Pool and added the parameter threaded=True to the cx_Oracle.SessionPool call, as in sPool=cx_Oracle.SessionPool(....., min=1, max=10, increment=1, threaded=True)
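The pattern this answer describes can be sketched without a database. The snippet below is only an illustration under stated assumptions: FakeSessionPool, s_pool, get_text and run are hypothetical names, the fake pool mimics only the acquire/release calls used in the question, and it is written for Python 3 rather than the Python 2.6 shown in the traceback. What it shows is that multiprocessing.dummy.Pool runs workers as threads in a single process, so a pool object defined at module level can be shared directly.

```python
import queue
from multiprocessing.dummy import Pool  # thread-based Pool with the same API


class FakeSessionPool:
    """Hypothetical stand-in for cx_Oracle.SessionPool (acquire/release only)."""

    def __init__(self, size):
        self._free = queue.Queue()
        for n in range(size):
            self._free.put("connection-%d" % n)

    def acquire(self):
        # In this stand-in, acquire blocks until a connection is free,
        # so at most `size` connections are in use at any time.
        return self._free.get()

    def release(self, connection):
        self._free.put(connection)


# Defined at module level, before the worker function is ever called.
s_pool = FakeSessionPool(size=3)


def get_text(appl_id):
    connection = s_pool.acquire()
    try:
        # A real worker would run the query here; this one just echoes the id.
        return (appl_id, "title for %s" % appl_id)
    finally:
        # Always hand the connection back, even if the work raised.
        s_pool.release(connection)


def run(ids, workers=5):
    pool = Pool(workers)  # five threads share three pooled connections
    try:
        return pool.map(get_text, ids)
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(run([101, 102, 103, 104]))
```

Because the workers are threads, nothing has to cross a process boundary, so neither the pool nor the results are pickled. The trade-off is that CPU-heavy processing of the fetched rows will not run in parallel this way.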

Answer 1: (score: 0)

How do you expect 50 processes to use the same database connection (pool) that is managed inside a single process?!