select_related (and prefetch_related) doesn't work as expected?

Asked: 2015-09-22 19:59:07

Tags: python django django-orm

My models:

from django.db import models

# Owner is defined elsewhere (not shown in the question).
class Anything(models.Model):
    first_owner = models.ForeignKey(Owner, related_name='first_owner')
    second_owner = models.ForeignKey(Owner, related_name='second_owner')

class Something(Anything):
    # 'One' is referenced by name because the class is defined further down.
    one_related = models.ForeignKey('One', related_name='one_related', null=True)
    many_related = models.ManyToManyField('One', related_name='many_related')

class One(models.Model):
    first = models.IntegerField(null=True)
    second = models.IntegerField(null=True)

In my code, I want to produce a small summary of my database with code like this:

from django.db.models import Q

all_owners = Owner.objects.all()
first_selection = []
second_selection = []

objects = {}

for owner in all_owners:
    # <<0>>
    items = Something.objects.filter(Q(first_owner=owner) | Q(second_owner=owner)).order_by('date').all()

    # Find owners that have more than 100 related "Something" elements
    if items.count() > 100:
        first_selection.append(owner)
        objects[owner] = items

    # Find owners that have more than 80 "Something" elements with at least one many_related element
    if items.filter(many_related__isnull=False).distinct().count() > 80:
        second_selection.append(owner)
        objects[owner] = items

# Now I pass first_selection, second_selection and objects to functions, but the following loops reproduce the same problem I'm getting:

# <<1>>
for owner in first_selection:
    for something in objects[owner]:
        rel = something.one_related
        print(str(rel.first) + "blablabla" + str(rel.second))

# <<2>>
for owner in first_selection:
    for something in objects[owner]:
        rel = something.one_related
        print(str(rel.first) + "blablabla" + str(rel.second))

# <<3>>
for owner in second_selection:
    for something in objects[owner]:
        rel = something.many_related.first()
        if rel is not None:
            print(str(rel.first) + "blablabla" + str(rel.second))

# <<4>>
for owner in second_selection:
    for something in objects[owner]:
        rel = something.many_related.first()
        if rel is not None:
            print(str(rel.first) + "blablabla" + str(rel.second))

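The problem is: the <<1>> loop takes 30 minutes to execute, while the <<2>> loop takes 2 seconds, even though they use the same data.
I know why this happens: the first loop fetches every one_related object and stores it in the cache. So I changed the code at <<0>> to: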

        items = Something.objects.filter(Q(first_owner=owner)|Q(second_owner=owner)).order_by('date').select_related('one_related').all()
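The caching effect described above can be imitated in plain Python. This is only a rough sketch that does not use Django: `FakeDB` and `FakeSomething` are hypothetical stand-ins invented for illustration, and the lookup counter plays the role of the query count.

```python
class FakeDB:
    """Hypothetical stand-in for the database; counts how many lookups are made."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def get(self, pk):
        self.queries += 1
        return self.rows[pk]


class FakeSomething:
    """Hypothetical object that loads its related row lazily and caches it."""

    def __init__(self, db, one_related_id):
        self._db = db
        self.one_related_id = one_related_id
        self._cache = {}

    @property
    def one_related(self):
        # The first access hits the "database"; later accesses reuse the cached value.
        if "one_related" not in self._cache:
            self._cache["one_related"] = self._db.get(self.one_related_id)
        return self._cache["one_related"]


db = FakeDB({1: ("a", "b"), 2: ("c", "d")})
items = [FakeSomething(db, 1), FakeSomething(db, 2), FakeSomething(db, 1)]

for something in items:  # like loop <<1>>: one lookup per item
    something.one_related
first_pass = db.queries

for something in items:  # like loop <<2>>: served from the per-object cache
    something.one_related
second_pass = db.queries - first_pass

print(first_pass, second_pass)
```

The second pass is only cheap because it walks the same objects as the first one, which mirrors how the stored `objects[owner]` results are reused between loops <<1>> and <<2>>.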

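When I look at the generated query, it appears to perform a join on the tables, but the problem persists (the first loop takes minutes, the second takes seconds). In fact, I used mysqltuner to show the number of executed queries, and it keeps growing during the first loop, although it should not.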
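To make the difference between per-row lookups and a single join concrete, here is a small sketch using only the standard-library `sqlite3` module instead of Django and MySQL. The table layout is an assumption that loosely mirrors the models above, and `lazy_pattern` and `joined_pattern` are hypothetical helper names. The statement counter relies on `Connection.set_trace_callback` and plays the role of the mysqltuner query count.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (id INTEGER PRIMARY KEY, first INTEGER, second INTEGER);
    CREATE TABLE something (id INTEGER PRIMARY KEY, one_related_id INTEGER REFERENCES one(id));
    INSERT INTO one VALUES (1, 10, 11), (2, 20, 21);
    INSERT INTO something VALUES (1, 1), (2, 2), (3, NULL), (4, 1);
""")

selects = []


def record_select(sql):
    # Keep only SELECT statements so that the counts are easy to compare.
    if sql.lstrip().upper().startswith("SELECT"):
        selects.append(sql)


conn.set_trace_callback(record_select)


def lazy_pattern():
    """One query for the parent rows, plus one query per non-null foreign key."""
    out = []
    parents = conn.execute("SELECT id, one_related_id FROM something ORDER BY id").fetchall()
    for something_id, one_id in parents:
        row = None
        if one_id is not None:
            row = conn.execute("SELECT first, second FROM one WHERE id = ?", (one_id,)).fetchone()
        out.append((something_id, row))
    return out


def joined_pattern():
    """A single LEFT JOIN that returns parent and related columns together."""
    rows = conn.execute(
        "SELECT s.id, o.id, o.first, o.second "
        "FROM something s LEFT JOIN one o ON o.id = s.one_related_id "
        "ORDER BY s.id"
    ).fetchall()
    return [(sid, None if oid is None else (first, second)) for sid, oid, first, second in rows]


selects.clear()
lazy_result = lazy_pattern()
lazy_count = len(selects)

selects.clear()
joined_result = joined_pattern()
joined_count = len(selects)

print(lazy_count, joined_count, lazy_result == joined_result)
```

If the join were really being used for the related objects, the statement count would be expected to stay flat while iterating, as in `joined_pattern`; a count that grows with every row looks like `lazy_pattern`.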

I guess the same applies to the 3rd and 4th loops and prefetch_related, although I do not have enough memory to test it.
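For the many-to-many case, the general idea behind prefetching is to load the related rows for all parents in one extra query and group them in Python. The sketch below shows that idea with the standard-library `sqlite3` module only; it is not Django's implementation. The join table name `something_many_related` and the helper `prefetch_many_related` are assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE one (id INTEGER PRIMARY KEY, first INTEGER, second INTEGER);
    CREATE TABLE something (id INTEGER PRIMARY KEY);
    CREATE TABLE something_many_related (something_id INTEGER, one_id INTEGER);
    INSERT INTO one VALUES (1, 10, 11), (2, 20, 21);
    INSERT INTO something VALUES (1), (2), (3);
    INSERT INTO something_many_related VALUES (1, 1), (1, 2), (3, 2);
""")


def prefetch_many_related(conn, something_ids):
    """Fetch related rows for all given parents in one query, grouped by parent id."""
    placeholders = ",".join("?" * len(something_ids))
    rows = conn.execute(
        "SELECT m.something_id, o.id, o.first, o.second "
        "FROM something_many_related m JOIN one o ON o.id = m.one_id "
        f"WHERE m.something_id IN ({placeholders}) "
        "ORDER BY m.something_id, o.id",
        something_ids,
    ).fetchall()
    grouped = defaultdict(list)
    for something_id, one_id, first, second in rows:
        grouped[something_id].append((one_id, first, second))
    return grouped


ids = [row[0] for row in conn.execute("SELECT id FROM something ORDER BY id")]
related = prefetch_many_related(conn, ids)

# Equivalent of taking the first related object per parent, or None if there is none.
first_related = {sid: next(iter(related.get(sid, [])), None) for sid in ids}
print(first_related)
```

Note that this approach holds every related row in memory at once, which is the trade-off behind the memory concern mentioned above.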

1 Answer:

Answer 0: (score: 0)

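So, I knew that select_related() called without arguments does not follow nullable foreign keys. What I did not know is that, after calling select_related('one_related'), only the id of the related object is selected if the field is nullable.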

To sum up, the answer to my question is to replace:

Something.objects.select_related('one_related')

with:

Something.objects.select_related('one_related', 'one_related__first', 'one_related__second')