Python Peewee MySQL bulk update

Date: 2017-08-01 01:51:21

Tags: python peewee

I'm using Python 2.7, Peewee and MySQL. My program reads from a CSV file and updates a field in the database if the order number is present in the CSV. There can be 2000-3000 updates, and I'm currently using the naive approach of updating the records one by one, which is dead slow. I've moved from Peewee updates to raw queries, which is a bit faster, but it's still slow. I'd like to know how I can update the records in fewer transactions, without looping over them one at a time.

def mark_as_uploaded_to_zoho(self, which_file):
    print "->Started marking the orders as uploaded to zoho."
    with open(which_file, 'rb') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row

        for row in reader:
            order_no = row[0]
            # One UPDATE statement (and one round-trip) per order number.
            query = 'UPDATE sales SET UploadedToZoho=1 WHERE OrderNumber=%s AND UploadedToZoho=0'
            SalesOrderLine.raw(query, order_no).execute()

    print "->Marked as uploaded to zoho."

1 Answer:

Answer 0 (score: -1)

You can use insert_many to limit the number of transactions and get a big speed boost. It takes an iterable that yields dictionary objects whose keys match the model's fields.

Depending on how many records you're trying to insert, you can either do them all at once or split them into smaller chunks. In the past I've inserted 10,000+ records at a time, but that can be very slow depending on the database server and client specs, so I'll show both approaches.

with open(which_file, 'rb') as f:
    reader = csv.DictReader(f)
    # insert_many returns a query object; it only runs once execute() is called.
    SalesOrderLine.insert_many(reader).execute()

OR

# Calls func with successive chunks of an iterable, each collected into a list.
# Each chunk is held in memory as a full list, so this is not especially memory efficient.
def chunkify(func, iterable, chunk_size):
    chunk = []
    for item in iterable:
        chunk.append(item)
        if len(chunk) >= chunk_size:
            func(chunk)
            chunk = []
    if chunk:  # flush the final, partially filled chunk
        func(chunk)

with open(which_file, 'rb') as f:
    reader = csv.DictReader(f)
    # Wrap insert_many in a lambda so each chunk's query is actually executed.
    chunkify(lambda rows: SalesOrderLine.insert_many(rows).execute(), reader, 1000)

For a more efficient way to "chunk" an iterator, check out this question.
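One common pattern, sketched here with itertools.islice (the linked question covers more variants), pulls a fixed number of rows at a time so only one chunk is ever held in memory:

from itertools import islice

def iter_chunks(iterable, chunk_size):
    # Lazily yields lists of up to chunk_size items from the iterable.
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            break
        yield chunk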

Simply wrapping the inserts in with db.atomic(), as outlined here, should give a further speed boost.
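A minimal sketch of that, assuming db is the peewee MySQLDatabase instance the models are bound to, and reusing the iter_chunks helper from above:

with open(which_file, 'rb') as f:
    reader = csv.DictReader(f)
    for rows in iter_chunks(reader, 1000):
        # Each chunk of rows is inserted inside its own transaction.
        with db.atomic():
            SalesOrderLine.insert_many(rows).execute()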