Question

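I am developing a threaded application in which one thread feeds a Queue with objects to be modified, and a number of other threads read from the queue, make the modifications, and save the changes.

The application does not need much concurrency, so I would like to stick with an SQLite database. Here is a small example that illustrates the application: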

import queue
import threading
import peewee as pw

db = pw.SqliteDatabase('test.db', threadlocals=True)

class Container(pw.Model):
    contents = pw.CharField(default="spam")

    class Meta:
        database = db


class FeederThread(threading.Thread):

    def __init__(self, input_queue):
        super().__init__()

        self.q = input_queue

    def run(self):
        containers = Container.select()

        for container in containers:
            self.q.put(container)


class ReaderThread(threading.Thread):

    def __init__(self, input_queue):
        super().__init__()

        self.q = input_queue

    def run(self):
        while True:
            item = self.q.get()

            with db.execution_context() as ctx:
                # Get a new connection to the container object:
                container = Container.get(id=item.id)
                container.contents = "eggs"
                container.save()

            self.q.task_done()


if __name__ == "__main__":

    db.connect()
    try:
        db.create_tables([Container,])
    except pw.OperationalError:
        pass
    else:
        [Container.create() for c in range(42)]
    db.close()

    q = queue.Queue(maxsize=10)


    feeder = FeederThread(q)
    feeder.setDaemon(True)
    feeder.start()

    for i in range(10):
        reader = ReaderThread(q)
        reader.setDaemon(True)
        reader.start()

    q.join()

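According to the peewee docs, SQLite should support multi-threading. However, I keep getting the infamous peewee.OperationalError: database is locked error, with the traceback pointing at the container.save() line.

How do I get around this?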

Answer 1

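I was surprised to see this fail as well, so I copied your code and played around with a few different ideas. I think the problem is that ExecutionContext() by default causes the wrapped block to run in a transaction. To avoid this, I passed False in the reader threads.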
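To illustrate why an open transaction matters, here is a minimal sketch using the standard-library sqlite3 module rather than peewee. It shows the general SQLite behaviour (a second writer is refused while another connection holds a write transaction), not peewee's internals. The function name, table layout, and temporary file path are assumptions made up for the illustration.

```python
import os
import sqlite3
import tempfile


def second_writer_error(path):
    """Return the error a second connection sees while the first holds a write transaction."""
    # isolation_level=None: transactions are controlled explicitly below.
    # timeout=0: fail at once instead of waiting for the lock to clear.
    first = sqlite3.connect(path, timeout=0, isolation_level=None)
    second = sqlite3.connect(path, timeout=0, isolation_level=None)
    try:
        first.execute(
            "CREATE TABLE IF NOT EXISTS container "
            "(id INTEGER PRIMARY KEY, contents TEXT)"
        )
        first.execute("BEGIN IMMEDIATE")  # first connection takes the write lock
        first.execute("INSERT INTO container (contents) VALUES ('spam')")
        try:
            second.execute("BEGIN IMMEDIATE")  # second writer cannot get the lock
        except sqlite3.OperationalError as exc:
            return str(exc)
        second.execute("ROLLBACK")
        return None
    finally:
        if first.in_transaction:
            first.execute("ROLLBACK")
        first.close()
        second.close()


# Hypothetical throwaway database file, used only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "lock_sketch.db")
message = second_writer_error(demo_path)
print(message)
```

The longer a write transaction stays open, the more likely other writers are to hit this error, which is why keeping the transaction as narrow as possible helps.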

I also edited the feeder to consume the SELECT statement (list(Container.select())) before putting anything into the queue.

The following works for me locally:

class FeederThread(threading.Thread):

    def __init__(self, input_queue):
        super(FeederThread, self).__init__()

        self.q = input_queue

    def run(self):
        containers = list(Container.select())

        for container in containers:
            self.q.put(container.id)  # I don't like passing model instances around like this, personal preference though

class ReaderThread(threading.Thread):

    def __init__(self, input_queue):
        super(ReaderThread, self).__init__()

        self.q = input_queue

    def run(self):
        while True:
            item = self.q.get()

            with db.execution_context(False):
                # Get a new connection to the container object:
                container = Container.get(id=item)
                container.contents = "nuggets"
                with db.atomic():
                    container.save()

            self.q.task_done()

if __name__ == "__main__":

    with db.execution_context():
        try:
            db.create_tables([Container,])
        except pw.OperationalError:  # peewee is imported as pw in the question
            pass
        else:
            [Container.create() for c in range(42)]

    # ... same ...

I'm not entirely satisfied with this, but hopefully it gives you some ideas.

Here is a blog post I wrote a while back that has some tips for getting higher concurrency with SQLite: http://charlesleifer.com/blog/sqlite-small-fast-reliable-choose-any-three-/

Answer 2

Have you tried WAL mode?

Improve INSERT-per-second performance of SQLite?

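If you have concurrent access to SQLite, you have to be very careful, because the whole database is locked while a write is in progress; multiple readers are possible, but other writes are locked out. This has been improved somewhat by the addition of WAL in newer SQLite versions.

and

If you are using multiple threads, you can try the shared page cache, which allows loaded pages to be shared between threads and can avoid expensive I/O calls.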
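Both suggestions quoted above can be tried with the standard-library sqlite3 module; the following is a rough sketch under assumptions, not a tuned configuration. The helper names, the database names, and the timeout are made up for illustration. Note that the shared-cache part uses two connections in a single thread just to show that they see the same data, and that it relies on the SQLite build having shared-cache support.

```python
import os
import sqlite3
import tempfile


def open_wal(path, timeout=10.0):
    """Open a file database and ask SQLite to switch it to write-ahead logging."""
    conn = sqlite3.connect(path, timeout=timeout)
    # The pragma returns the journal mode that is actually in effect afterwards.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


def open_shared_cache(name):
    """Open a connection to a named in-memory database using the shared page cache."""
    uri = "file:{}?mode=memory&cache=shared".format(name)
    return sqlite3.connect(uri, uri=True)


# WAL: hypothetical throwaway database file.
wal_conn, journal_mode = open_wal(
    os.path.join(tempfile.mkdtemp(), "wal_sketch.db")
)
print(journal_mode)

# Shared cache: two connections opened with the same name share one database.
a = open_shared_cache("shared_sketch")
b = open_shared_cache("shared_sketch")
with a:  # commits on success
    a.execute("CREATE TABLE container (id INTEGER PRIMARY KEY, contents TEXT)")
    a.execute("INSERT INTO container (contents) VALUES ('spam')")
rows_seen_by_b = b.execute("SELECT contents FROM container").fetchall()
print(rows_seen_by_b)
```

The journal mode is stored in the database file, so the pragma only needs to succeed once per file; checking its return value is still worthwhile, because SQLite reports the mode it actually ended up with.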

Peewee, SQLite and threading
