Error when writing parsed data to sqlite with multiprocessing

Date: 2018-07-04 00:25:16

Tags: python python-3.x web-scraping sqlite multiprocessing

I'm trying to parse a bunch of links and append the parsed data to an sqlite3 database. I keep getting an error saying the sqlite3 database is locked. Could it be because the pool size I'm using is too high? I tried lowering it to 5, but I still get the error below.

My code basically looks like this:

from multiprocessing import Pool

with Pool(5) as p:
    p.map(parse_link, links)

My real code looks like this:

with Pool(5) as p:
    p.map(Get_FT_OU, file_to_set('links.txt'))
    # Where Get_FT_OU(link) appends links to a sqlite3 database.
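
For context, the worker presumably looks roughly like the sketch below. The parsing helper, DB_FILENAME and the results table are my assumptions, not the actual CP_Parser.py code; the point is that every pool worker runs in its own process and opens its own connection, so with Pool(5) up to five processes can try to write at the same time, while SQLite only allows one writer at a time:

import sqlite3

DB_FILENAME = 'odds.db'  # assumed filename; the real one is defined in CP_Parser.py

def parse_page(link):
    # stand-in for the real scraping/parsing logic
    return link.upper()

def Get_FT_OU(link):
    # Each pool worker executes this in a separate process with its own
    # connection, so several INSERTs can collide on SQLite's write lock.
    data = parse_page(link)
    conn = sqlite3.connect(DB_FILENAME)
    cursor = conn.cursor()
    cursor.execute("INSERT INTO results (link, data) VALUES (?, ?)", (link, data))  # assumed table
    conn.commit()
    conn.close()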

While the code runs I keep hitting these errors. Can someone help me fix this?

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/Users/christian/Documents/GitHub/odds/CP_Parser.py", line 166, in Get_FT_OU
    cursor.execute(sql_str)
sqlite3.OperationalError: database is locked
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/christian/Documents/GitHub/odds/CP_Parser.py", line 206, in <module>
    p.map(Get_FT_OU, file_to_set('links.txt'))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
sqlite3.OperationalError: database is locked

I can run the code just fine without multiprocessing, and I actually don't get any errors with Pool(2) either, but as soon as I go higher these errors show up. I'm on the latest MacBook Air.

1 Answer:

Answer 0 (score: 0)

Somehow got it working by adding timeout=10 to the connection:

conn = sqlite3.connect(DB_FILENAME, timeout=10)
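
The timeout argument is how many seconds sqlite3 waits for another connection's lock to be released before raising OperationalError (the default is 5 seconds), so with several workers writing concurrently, a longer timeout gives each INSERT time to acquire the lock instead of failing immediately.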