Peewee python2.7 database locked error

Time: 2016-01-24 18:39:07

Tags: python multithreading python-2.7 sqlite peewee

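I have a script that queries the database for "blogs" and, for each blog, starts a thread that fetches its RSS address, checks for new posts, and records them in the database. Initially I ran the script with at most two parallel threads (retrieving RSS information from two blogs at the same time). Then I started getting a "database is locked" error. I have since reduced it to one thread and I still get the error.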

For the database connection and ORM I am using peewee 2.7.4:

from peewee import *
from playhouse.sqlite_ext import SqliteExtDatabase

db = SqliteExtDatabase(APP_DIR + '/ml.db')

class BaseModel(Model):
    class Meta:
        database = db

class Blog(BaseModel):
    (...)

class Post(BaseModel):
    (...)

And the script that starts it:

import threading
import time

def start():
    global ACTIVE_THREADS, MAX_THREADS
    # Select only the active blogs
    blogs = Blog.select().where(Blog.active == 1)
    for blog in blogs:
        while ACTIVE_THREADS == MAX_THREADS:
            print 'Max number of threads %d reached. zzzz' % MAX_THREADS
            time.sleep(1)
        blog.processing=1
        blog.save()

        ACTIVE_THREADS += 1

        th = threading.Thread(target=process_blog, args=(blog,))
        th.daemon = True
        th.start()

def process_blog(blog):
    global ACTIVE_THREADS
    get_new_posts_url_for_blog(blog)  # here Post records are created with downloaded=0
    posts = Post.select().where(Post.downloaded == 0)
    for post in posts:
        content = get_content_for_post(post.url)
        post.content = content
        post.downloaded = 1
        post.save()  # This is where the database locked error is thrown :(
    ACTIVE_THREADS -= 1

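This is a simplified version of the script, of course, but that is basically it. On the first iteration of the loop over posts, post.save() raises the following error: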

  File "/home/thilux/virtual_envs/ptmla/local/lib/python2.7/site-packages/peewee.py", line 4573, in save
    rows = self.update(**field_dict).where(self._pk_expr()).execute()
  File "/home/thilux/virtual_envs/ptmla/local/lib/python2.7/site-packages/peewee.py", line 3013, in execute
    return self.database.rows_affected(self._execute())
  File "/home/thilux/virtual_envs/ptmla/local/lib/python2.7/site-packages/peewee.py", line 2555, in _execute
    return self.database.execute_sql(sql, params, self.require_commit)
  File "/home/thilux/virtual_envs/ptmla/local/lib/python2.7/site-packages/peewee.py", line 3366, in execute_sql
    self.commit()
  File "/home/thilux/virtual_envs/ptmla/local/lib/python2.7/site-packages/peewee.py", line 3212, in __exit__
    reraise(new_type, new_type(*exc_args), traceback)
  File "/home/thilux/virtual_envs/ptmla/local/lib/python2.7/site-packages/peewee.py", line 3359, in execute_sql
    cursor.execute(sql, params or ())
OperationalError: database is locked

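Keep in mind that I am now running with MAX_THREADS = 1, so only one blog is processed at a time. What puzzles me is that in the first runs I used MAX_THREADS = 2 and everything ran to completion just fine. The error only started a few days ago, so I wonder whether the select on blogs in the main thread is what holds the lock (perhaps the select stays attached and I have to detach it somehow). Can anyone help me with this? This is really a small process, and I would rather not switch to another database engine for it. I also see a performance benefit from threading, so being able to run at least 2 threads in parallel is essential.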

Thank you very much for your help.

Thanks, TS

2 answers:

Answer 0: (score: 0)

Try wrapping your writes with the atomic() context manager.
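To illustrate why short, explicitly bounded write transactions matter, here is a minimal sketch that uses the standard-library sqlite3 module rather than peewee. The database file, the post table, and the names writer_a and writer_b are all hypothetical. It shows one way "database is locked" can arise: a second connection tries to write while the first still holds an open write transaction. It is not a diagnosis of the script in the question.

```python
import os
import sqlite3
import tempfile

# Hypothetical throwaway database; table and column names are illustrative.
path = os.path.join(tempfile.mkdtemp(), 'demo.db')

# isolation_level=None puts the sqlite3 module in autocommit mode, so
# transactions are controlled explicitly with BEGIN/COMMIT below.
writer_a = sqlite3.connect(path, isolation_level=None)
writer_a.execute('CREATE TABLE post (id INTEGER PRIMARY KEY, downloaded INTEGER)')
writer_a.execute('INSERT INTO post (downloaded) VALUES (0)')

# timeout=0.1: wait at most 0.1 seconds for a lock before giving up.
writer_b = sqlite3.connect(path, timeout=0.1, isolation_level=None)

# Writer A opens a write transaction and leaves it open.
writer_a.execute('BEGIN IMMEDIATE')
writer_a.execute('UPDATE post SET downloaded = 1 WHERE id = 1')

# Writer B cannot obtain the write lock while A's transaction is open.
try:
    writer_b.execute('UPDATE post SET downloaded = 2 WHERE id = 1')
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Once A commits, the lock is released and B's write goes through.
writer_a.execute('COMMIT')
writer_b.execute('UPDATE post SET downloaded = 2 WHERE id = 1')
final_value = writer_b.execute(
    'SELECT downloaded FROM post WHERE id = 1').fetchone()[0]
```

In peewee, a `with db.atomic():` block around the post.save() calls plays roughly the role of the explicit BEGIN and COMMIT pair here: the write transaction is opened and committed in one short step. Whether that removes the error in the question's script is not confirmed in this thread.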

Answer 1: (score: -2)

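Without knowing what peewee is, and without seeing the SQL queries being executed, I cannot say exactly what is going wrong. There are two things you can do: 1. Check whether there is a way to print the actual SQL statements being executed. 2. For possible causes, see this link.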
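On the first suggestion: peewee reports the queries it executes through the standard logging module, under the logger name 'peewee' at DEBUG level. A minimal sketch, assuming it runs before the models are used and that output to stderr is acceptable:

```python
import logging

# peewee sends each query it executes to the 'peewee' logger at DEBUG level.
logger = logging.getLogger('peewee')
logger.setLevel(logging.DEBUG)

# StreamHandler with no arguments writes to stderr.
logger.addHandler(logging.StreamHandler())
```

With this in place, each statement issued by calls such as Blog.select() or post.save() should appear in the output, which makes it easier to see which write was in flight when the lock error occurred.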