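What is an efficient way of inserting thousands of records into an SQLite table using Django?

Time: 2009-07-16 08:08:31

Tags: python sql django sqlite insert

I have to insert 8000+ records into a SQLite database using Django's ORM. This operation needs to run as a cronjob about once per minute. At the moment I'm using a for loop to iterate through all the items and then insert them one by one. Example: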

for item in items:
    entry = Entry(a1=item.a1, a2=item.a2)
    entry.save()

What is an efficient way of doing this?

Edit: A little comparison between the two insertion methods.

Without the commit_manually decorator (11245 records):

[nox@noxdevel marinetraffic]$ time python manage.py insrec

real    1m50.288s
user    0m6.710s
sys     0m23.445s

With the commit_manually decorator (11245 records):

[nox@noxdevel marinetraffic]$ time python manage.py insrec                

real    0m18.464s
user    0m5.433s
sys     0m10.163s

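Note: Besides inserting into the database, the test script also does some other work (it downloads a ZIP file, extracts an XML file from the ZIP archive, and parses the XML), so the execution time does not necessarily represent the time needed to insert the records.

9 answers:

Answer 0: (score: 115)

You want to check out django.db.transaction.commit_manually.

http://docs.djangoproject.com/en/dev/topics/db/transactions/#django-db-transaction-commit-manually

So it would be something like: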

from django.db import transaction

@transaction.commit_manually
def viewfunc(request):
    ...
    for item in items:
        entry = Entry(a1=item.a1, a2=item.a2)
        entry.save()
    transaction.commit()

This commits only once, instead of at each save().
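To see why a single commit helps, independent of Django, here is a minimal sketch using Python's standard sqlite3 module. The table name `entry`, its columns, and all function names are made up for illustration. This is not how Django implements commit_manually; it only shows the same idea of one commit per row versus one commit per batch.

```python
import sqlite3


def make_db(path):
    """Open a database in autocommit mode and create a hypothetical table."""
    # isolation_level=None: the sqlite3 module never opens a transaction
    # implicitly, so each statement commits on its own unless we BEGIN.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("CREATE TABLE IF NOT EXISTS entry (a1 TEXT, a2 INTEGER)")
    return conn


def insert_commit_each(conn, items):
    """Each INSERT is its own transaction (one commit per row)."""
    for a1, a2 in items:
        conn.execute("INSERT INTO entry (a1, a2) VALUES (?, ?)", (a1, a2))


def insert_single_transaction(conn, items):
    """All rows go into one explicit transaction that is committed once."""
    conn.execute("BEGIN")
    try:
        for a1, a2 in items:
            conn.execute("INSERT INTO entry (a1, a2) VALUES (?, ?)", (a1, a2))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise


def count_rows(conn):
    """Return the number of rows currently in the table."""
    return conn.execute("SELECT COUNT(*) FROM entry").fetchone()[0]
```

On a file-backed database the second function is usually much faster, because SQLite has to make each commit durable on disk. With ":memory:" the difference mostly disappears.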

In Django 1.3, context managers were introduced, so now you can use transaction.commit_on_success() in a similar way:

from django.db import transaction

def viewfunc(request):
    ...
    with transaction.commit_on_success():
        for item in items:
            entry = Entry(a1=item.a1, a2=item.a2)
            entry.save()

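In Django 1.4, bulk_create was added, allowing you to create a list of model objects and then commit them all at once.

NOTE: the model's save() method is not called when using bulk_create.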

>>> Entry.objects.bulk_create([
...     Entry(headline="Django 1.0 Released"),
...     Entry(headline="Django 1.1 Announced"),
...     Entry(headline="Breaking: Django is awesome")
... ])

In Django 1.6, transaction.atomic was introduced, intended to replace the now-legacy functions commit_on_success and commit_manually.

From the Django documentation on atomic:

atomic is usable both as a decorator:

from django.db import transaction

@transaction.atomic
def viewfunc(request):
    # This code executes inside a transaction.
    do_stuff()

and as a context manager:

from django.db import transaction

def viewfunc(request):
    # This code executes in autocommit mode (Django's default).
    do_stuff()

    with transaction.atomic():
        # This code executes inside a transaction.
        do_more_stuff()

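Answer 1: (score: 11)

Answer 2: (score: 3)

Have a look at this. It is meant to work out of the box with MySQL only, but there are pointers on what to do for other databases.

Answer 3: (score: 3)

You might be better off bulk-loading the items: prepare a file and use a bulk-load tool. This will be vastly more efficient than 8000 individual inserts.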
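For SQLite, the sqlite3 command-line shell has an .import command for loading a prepared file. As a rough stand-in in Python, the sketch below reads CSV lines and hands them to executemany inside one transaction. The table `entry`, its two columns, and the function name are assumptions for illustration, and the input is assumed to contain no blank lines.

```python
import csv
import sqlite3


def load_csv_lines(conn, lines):
    """Insert two-column CSV lines into a hypothetical `entry` table."""
    # csv.reader accepts an open file or any other iterable of strings.
    rows = csv.reader(lines)
    with conn:  # one transaction: commit on success, roll back on error
        conn.executemany("INSERT INTO entry (a1, a2) VALUES (?, ?)", rows)
```

With a real file you would pass the open file object, for example `with open(path, newline="") as handle: load_csv_lines(conn, handle)`.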

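Answer 4: (score: 2)

You should check out DSE. I wrote DSE to solve these kinds of problems (massive inserts or updates). Using the Django ORM is a dead end; you have to do it in plain SQL, and DSE takes care of much of that for you.

Thomas

Answer 5: (score: 2)

To answer the question specifically with regard to SQLite, as asked: I have just confirmed that bulk_create does provide a tremendous speedup, but there is a limitation with SQLite: "The default is to create all objects in one batch, except for SQLite where the default is such that at maximum 999 variables per query is used."

The quoted text is from the docs; A-IV provided a link.

What I have to add is that this djangosnippets entry by alpar also seems to work for me. It is a small wrapper that breaks the big batch you want to process into smaller batches, managing the 999-variable limit.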
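The batching idea can be sketched in a few lines. This is not alpar's snippet, only a minimal illustration of the same approach; `rows_per_batch` and `chunked` are hypothetical helper names, and the 999 figure is the limit quoted above.

```python
from itertools import islice

SQLITE_MAX_VARIABLES = 999  # the per-query limit quoted from the docs


def rows_per_batch(fields_per_row, max_variables=SQLITE_MAX_VARIABLES):
    """Largest number of rows whose placeholders fit into one query."""
    if fields_per_row < 1:
        raise ValueError("fields_per_row must be at least 1")
    return max(1, max_variables // fields_per_row)


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Intended use with the Entry model from the question (two fields per row):
#
#     for batch in chunked(entries, rows_per_batch(2)):
#         Entry.objects.bulk_create(batch)
```

Later Django versions also accept a batch_size argument to bulk_create, which may make a wrapper like this unnecessary.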

Answer 6: (score: 0)

def order(request):    
    if request.method=="GET":
        # get the value from html page
        cust_name = request.GET.get('cust_name', '')
        cust_cont = request.GET.get('cust_cont', '')
        pincode = request.GET.get('pincode', '')
        city_name = request.GET.get('city_name', '')
        state = request.GET.get('state', '')
        contry = request.GET.get('contry', '')
        gender = request.GET.get('gender', '')
        paid_amt = request.GET.get('paid_amt', '')
        due_amt = request.GET.get('due_amt', '')
        order_date = request.GET.get('order_date', '')
        prod_name = request.GET.getlist('prod_name[]', '')
        prod_qty = request.GET.getlist('prod_qty[]', '')
        prod_price = request.GET.getlist('prod_price[]', '')

        # Insert the customer, then the order, then the products
        try:
            # Insert data into the customer table
            cust_tab = Customer(customer_name=cust_name, customer_contact=cust_cont, gender=gender, city_name=city_name, pincode=pincode, state_name=state, contry_name=contry)
            cust_tab.save()
            # Use the ID that save() assigned to this row; querying for the last
            # row instead can return another request's customer under concurrency
            custo_id = cust_tab.customer_id
            # Insert data into the Orders table
            order_tab = Orders(order_date=order_date, paid_amt=paid_amt, due_amt=due_amt, customer_id=custo_id)
            order_tab.save()
            # Insert data into the Products table
            # Insert multiple rows from Django, one save() per row, using a while loop
            i=0
            while(i<len(prod_name)):
                p_n = prod_name[i]
                p_q = prod_qty[i]
                p_p = prod_price[i]

                # Save the product row only if none of its fields is empty
                if p_n != "" and p_q != "" and p_p != "":
                    prod_tab = Products(product_name=p_n, product_qty=p_q, product_price=p_p, customer_id=custo_id)
                    prod_tab.save()
                i=i+1

            return HttpResponse('Your Record Has been Saved')
        except Exception as e:
            return HttpResponse(e)     

    return render(request, 'invoice_system/order.html')

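Answer 7: (score: -1)

I recommend using plain SQL (not the ORM); you can insert multiple rows with a single insert:

insert into A select * from B;

The select-from-B part of your SQL can be as complicated as you want, as long as the results match the columns in table A and there are no constraint conflicts.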
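As a rough illustration with Python's standard sqlite3 module, the sketch below copies rows from a staging table to a target table in one statement. The tables A and B follow the example above, while their columns, the filter on a2, and the function names are assumptions made for the example.

```python
import sqlite3


def make_tables(conn):
    """Create a hypothetical staging table B and target table A."""
    conn.execute("CREATE TABLE B (a1 TEXT, a2 INTEGER)")
    conn.execute("CREATE TABLE A (a1 TEXT, a2 INTEGER)")


def copy_rows(conn, min_a2):
    """Copy matching rows from B to A with a single INSERT ... SELECT."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT INTO A (a1, a2) SELECT a1, a2 FROM B WHERE a2 >= ?",
            (min_a2,),
        )
    # Return the number of rows now present in the target table.
    return conn.execute("SELECT COUNT(*) FROM A").fetchone()[0]
```

Naming the columns explicitly on both sides keeps the statement valid even if the two tables later gain columns in a different order.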

Answer 8: (score: -2)

def order(request):    
    if request.method=="GET":
        cust_name = request.GET.get('cust_name', '')
        cust_cont = request.GET.get('cust_cont', '')
        pincode = request.GET.get('pincode', '')
        city_name = request.GET.get('city_name', '')
        state = request.GET.get('state', '')
        contry = request.GET.get('contry', '')
        gender = request.GET.get('gender', '')
        paid_amt = request.GET.get('paid_amt', '')
        due_amt = request.GET.get('due_amt', '')
        order_date = request.GET.get('order_date', '')
        print(order_date)
        prod_name = request.GET.getlist('prod_name[]', '')
        prod_qty = request.GET.getlist('prod_qty[]', '')
        prod_price = request.GET.getlist('prod_price[]', '')
        print(prod_name)
        print(prod_qty)
        print(prod_price)
        # Insert the customer, then the order, then the products
        try:
            # Insert data into the customer table
            cust_tab = Customer(customer_name=cust_name, customer_contact=cust_cont, gender=gender, city_name=city_name, pincode=pincode, state_name=state, contry_name=contry)
            cust_tab.save()
            # Use the ID that save() assigned to this row; querying for the last
            # row instead can return another request's customer under concurrency
            custo_id = cust_tab.customer_id
            # Insert data into the Orders table
            order_tab = Orders(order_date=order_date, paid_amt=paid_amt, due_amt=due_amt, customer_id=custo_id)
            order_tab.save()
            # Insert data into the Products table
            # Insert multiple rows from Django, one save() per row, using a while loop
            i=0
            while(i<len(prod_name)):
                p_n = prod_name[i]
                p_q = prod_qty[i]
                p_p = prod_price[i]
                # Save the product row only if none of its fields is empty
                if p_n != "" and p_q != "" and p_p != "":
                    prod_tab = Products(product_name=p_n, product_qty=p_q, product_price=p_p, customer_id=custo_id)
                    prod_tab.save()
                i=i+1