How do I eliminate inefficient Django queries in a loop?

Time: 2017-07-04 05:27:35

Tags: django performance django-queryset

How can I make the following code more efficient (for example, by replacing the loop with a query)?

from django.core.exceptions import ObjectDoesNotExist


def get_question(datetime_now, questions_queryset, user):
    best_schedule = None
    best_question = None
    # HOW TO ELIMINATE THE FOLLOWING LOOP AND REPLACE WITH A QUERY?
    for question in questions_queryset:
        try:
            schedule = (Schedule.objects
                        .filter(question=question, user=user)
                        .latest('datetime_added'))
        except ObjectDoesNotExist:
            schedule = None
        if (schedule and (schedule.datetime_show_next >= datetime_now) and
                ((not best_schedule) or
                 (schedule.datetime_added >= best_schedule.datetime_added))):
            best_schedule = schedule
            best_question = question

    return best_question



models.py

from django.contrib.auth.models import User
from django.db.models import DateTimeField, ForeignKey, Model, TextField

class Question(Model):
    question = TextField()

class Schedule(Model):
    datetime_added = DateTimeField(auto_now_add=True)
    datetime_show_next = DateTimeField(null=True, default=None)
    question = ForeignKey(Question)
    user = ForeignKey(User, null=True)

2 Answers:

Answer 0 (score: 5)

You can use Subquery, as in this answer (https://stackoverflow.com/a/43883397/3627387), or Prefetch (https://stackoverflow.com/a/31237026/3627387).

Here is one way to do this with Prefetch:

from django.db.models import Prefetch

schedules_prefetch = Prefetch(
        'schedule_set',
        queryset=Schedule.objects.filter(user=user))
for question in questions_queryset.prefetch_related(schedules_prefetch):
    try:
        # using max here so it wouldn't do another DB hit
        schedule = max(question.schedule_set.all(),
                       key=lambda x: x.datetime_added)
    except ValueError:
        schedule = None
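
The Prefetch snippet stops once `schedule` has been found for each question; the selection rule from the question still has to be applied, and with prefetched data it can run entirely in memory. The following is a minimal sketch of that step in plain Python so it runs without a database. `ScheduleRow`, `pick_best_question`, and the dict-of-lists input are hypothetical stand-ins for the prefetched `question.schedule_set.all()` results, and skipping schedules whose `datetime_show_next` is null is an assumption the original loop does not spell out.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a Schedule row; only the two fields that the
# selection rule reads are modelled here.
ScheduleRow = namedtuple("ScheduleRow", ["datetime_added", "datetime_show_next"])


def pick_best_question(datetime_now, schedules_by_question):
    """Return the question whose latest schedule is still due, or None.

    `schedules_by_question` maps each question to the list of its schedules
    for one user (what the prefetch would have loaded).
    """
    best_question = None
    best_schedule = None
    for question, schedules in schedules_by_question.items():
        # default=None avoids the ValueError that max() raises on an empty list.
        latest = max(schedules, key=lambda s: s.datetime_added, default=None)
        # Assumption: a missing schedule or a null show-next date never qualifies.
        if latest is None or latest.datetime_show_next is None:
            continue
        if latest.datetime_show_next < datetime_now:
            continue
        if (best_schedule is None
                or latest.datetime_added >= best_schedule.datetime_added):
            best_schedule, best_question = latest, question
    return best_question


now = datetime(2017, 7, 4)
example = {
    # Latest schedule is still due, so q1 is a candidate.
    "q1": [ScheduleRow(datetime(2017, 7, 1), datetime(2017, 7, 10))],
    # The older schedule is due, but only the latest one counts, and it is not.
    "q2": [ScheduleRow(datetime(2017, 6, 30), datetime(2017, 7, 20)),
           ScheduleRow(datetime(2017, 7, 2), datetime(2017, 7, 3))],
    # No schedules at all for this user.
    "q3": [],
}
best = pick_best_question(now, example)
```

With real model instances, the dictionary would be replaced by iterating over `questions_queryset.prefetch_related(schedules_prefetch)` and reading `question.schedule_set.all()`, which issues two queries in total instead of one per question.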

Here is an example using Subquery (it may not work as written, but it shows the general idea):

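from django.db.models import OuterRef, Subquery

# Schedules of the outer question for this user, newest first.
schedules = (Schedule.objects
             .filter(user=user, question=OuterRef('pk'))
             .order_by('-datetime_added'))
# A Subquery must return a single column, so select only the primary key.
questions_queryset = (questions_queryset
                      .annotate(latest_schedule=Subquery(
                          schedules.values('pk')[:1])))
for question in questions_queryset:
    # Primary key of the latest schedule, or None if the question has none.
    schedule_id = question.latest_schedule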
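
To make the Subquery idea more concrete, here is a rough sketch of the kind of correlated subquery involved, run against an in-memory SQLite database with the standard `sqlite3` module. The table and column names, the sample rows, and the `best_question_id` helper are simplified assumptions for illustration; this is not the SQL Django generates, and it does not restrict the result to a particular `questions_queryset`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE question (id INTEGER PRIMARY KEY, question TEXT);
    CREATE TABLE schedule (
        id INTEGER PRIMARY KEY,
        question_id INTEGER REFERENCES question(id),
        user_id INTEGER,
        datetime_added TEXT,
        datetime_show_next TEXT
    );
    INSERT INTO question VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO schedule VALUES
        (1, 1, 7, '2017-07-01', '2017-07-10'),
        (2, 2, 7, '2017-06-30', '2017-07-20'),
        (3, 2, 7, '2017-07-02', '2017-07-03');
""")


def best_question_id(conn, user_id, now):
    """Return the id of the question whose latest schedule is still due."""
    row = conn.execute("""
        SELECT s.question_id
        FROM schedule AS s
        WHERE s.user_id = :user
          AND s.datetime_show_next >= :now
          -- Keep a row only if it is the newest schedule for its question.
          AND s.datetime_added = (
                SELECT MAX(s2.datetime_added)
                FROM schedule AS s2
                WHERE s2.question_id = s.question_id
                  AND s2.user_id = :user)
        ORDER BY s.datetime_added DESC
        LIMIT 1
    """, {"user": user_id, "now": now}).fetchone()
    return row[0] if row else None


# ISO-formatted date strings compare correctly as text in SQLite.
result = best_question_id(conn, 7, "2017-07-04")
```

The whole selection happens in one statement: the inner `SELECT MAX(...)` plays the role of the per-question `latest()` call, and the outer `ORDER BY ... LIMIT 1` replaces the running `best_schedule` comparison in the loop.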

Answer 1 (score: 2)

    # Get the question ids
    question_ids = questions_queryset.values_list('id', flat=True)

    # Get the latest schedule across those questions
    schedule = (Schedule.objects
                .filter(question__in=question_ids, user=user)
                .latest('datetime_added'))

    # latest() returns a single object, so unlike Schedule.objects.get() it
    # does not raise MultipleObjectsReturned when several schedules match