How do I get a queryset from multiple levels of sets/foreign keys?

Time: 2018-10-30 22:22:10

Tags: python django

If A contains a set of B, and B contains a set of C, I am looking for a way to start with an A and end up with a queryset of C.

A simple example:

class Book(models.Model):
    name = models.CharField(max_length=64)


class Page(models.Model):
    number = models.IntegerField()
    book = models.ForeignKey(Book)


class Paragraph(models.Model):
    number = models.IntegerField()
    page = models.ForeignKey(Page)


def query():
    books = Book.objects.all()\
       .prefetch_related('page_set', 'page_set__paragraph_set')

    for book in books:
        pages = book.page_set

        # I need to do something like this
        paragraphs = pages.all().paragraph_set
        # invalid

        # or
        paragraphs = book.page_set.select_related('paragraph_set')
        # valid, but paragraphs is still a QuerySet of Pages

        # this works, but results in one query for EVERY book,
        # which is what I need to avoid
        paragraphs = Paragraph.objects.filter(page__book=book)


        # do stuff with the book
        #...


        # do stuff with the paragraphs in the book
        # ...

How can I get a queryset of Paragraphs starting from nothing but a Book instance?

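The keyword-argument syntax for Django queries supports unlimited nesting of set/foreign key relationships, but I can't find a way to actually get the related queryset from the top down using the ORM mapping.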

Getting the queryset from the bottom up negates the benefit of prefetch_related / select_related.

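The example above is a simplified version of what I need to do in my application. The database has thousands of "Books", so any n + 1 queries must be avoided.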

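I found a question about using prefetch across multiple levels, but its answer did not address how to actually get hold of the fetched queryset in order to use it.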

2 answers:

Answer 0: (score: 1)

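Once the prefetch is done, it seems that the only cheap way to access the child records is through all(). Any filter seems to trigger another database query.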

The short answer to your question about all the paragraphs in a book is to use a list comprehension with two levels:

    paragraphs = [paragraph
                  for page in book.page_set.all()
                  for paragraph in page.paragraph_set.all()]
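
If this flattening is needed in several places, it could be wrapped in a small helper. The following is only a sketch: `paragraphs_in` and its `keep` argument are hypothetical names, and the `Fake*` classes are stand-ins that exist purely so the snippet runs without a database. The helper touches only `.all()`, so with prefetched Django models it should read from the prefetch cache, and any filtering happens in Python rather than through `.filter()`. Note that the result is a plain list, not a QuerySet.

```python
from itertools import chain


def paragraphs_in(book, keep=None):
    """Return a list of the book's paragraphs using only .all() calls.

    `keep` is an optional predicate applied in Python, so no extra
    database query is needed for filtering.
    """
    found = chain.from_iterable(
        page.paragraph_set.all() for page in book.page_set.all())
    if keep is None:
        return list(found)
    return [paragraph for paragraph in found if keep(paragraph)]


# Hypothetical stand-ins for the Django models and related managers,
# used only so that this sketch runs without a database.
class FakeManager:
    def __init__(self, items):
        self._items = list(items)

    def all(self):
        return list(self._items)


class FakeParagraph:
    def __init__(self, number):
        self.number = number


class FakePage:
    def __init__(self, paragraphs):
        self.paragraph_set = FakeManager(paragraphs)


class FakeBook:
    def __init__(self, pages):
        self.page_set = FakeManager(pages)


book = FakeBook([
    FakePage([FakeParagraph(1), FakeParagraph(2)]),
    FakePage([FakeParagraph(1), FakeParagraph(2)]),
])

all_numbers = [p.number for p in paragraphs_in(book)]
late_numbers = [p.number
                for p in paragraphs_in(book, keep=lambda p: p.number > 1)]
print(all_numbers)
print(late_numbers)
```

With real models, the call would simply be `paragraphs_in(book)` inside the loop over the prefetched `books`.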

Here is a runnable example:

# Tested with Django 1.11.13
from __future__ import print_function
import sys

import django
from django.apps import apps
from django.apps.config import AppConfig
from django.conf import settings
from django.db import connections, models, DEFAULT_DB_ALIAS
from django.db.models.base import ModelBase

NAME = 'udjango'


def main():
    setup()

    class Book(models.Model):
        name = models.CharField(max_length=64)

    class Page(models.Model):
        number = models.IntegerField()
        book = models.ForeignKey(Book)

    class Paragraph(models.Model):
        number = models.IntegerField()
        page = models.ForeignKey(Page)

    syncdb(Book)
    syncdb(Page)
    syncdb(Paragraph)

    b = Book.objects.create(name='Gone With The Wind')
    p = b.page_set.create(number=1)
    p.paragraph_set.create(number=1)
    b = Book.objects.create(name='The Three Body Problem')
    p = b.page_set.create(number=1)
    p.paragraph_set.create(number=1)
    p.paragraph_set.create(number=2)
    p = b.page_set.create(number=2)
    p.paragraph_set.create(number=1)
    p.paragraph_set.create(number=2)

    books = Book.objects.all().prefetch_related('page_set',
                                                'page_set__paragraph_set')

    for book in books:
        print(book.name)
        paragraphs = [paragraph
                      for page in book.page_set.all()
                      for paragraph in page.paragraph_set.all()]
        for paragraph in paragraphs:
            print(paragraph.page.number, paragraph.number)


def setup():
    DB_FILE = NAME + '.db'
    with open(DB_FILE, 'w'):
        pass  # wipe the database
    settings.configure(
        DEBUG=True,
        DATABASES={
            DEFAULT_DB_ALIAS: {
                'ENGINE': 'django.db.backends.sqlite3',
                'NAME': DB_FILE}},
        LOGGING={'version': 1,
                 'disable_existing_loggers': False,
                 'formatters': {
                    'debug': {
                        'format': '%(asctime)s[%(levelname)s]'
                                  '%(name)s.%(funcName)s(): %(message)s',
                        'datefmt': '%Y-%m-%d %H:%M:%S'}},
                 'handlers': {
                    'console': {
                        'level': 'DEBUG',
                        'class': 'logging.StreamHandler',
                        'formatter': 'debug'}},
                 'root': {
                    'handlers': ['console'],
                    'level': 'WARN'},
                 'loggers': {
                    "django.db": {"level": "DEBUG"}}})
    app_config = AppConfig(NAME, sys.modules['__main__'])
    apps.populate([app_config])
    django.setup()
    original_new_func = ModelBase.__new__

    @staticmethod
    def patched_new(cls, name, bases, attrs):
        if 'Meta' not in attrs:
            class Meta:
                app_label = NAME
            attrs['Meta'] = Meta
        return original_new_func(cls, name, bases, attrs)
    ModelBase.__new__ = patched_new


def syncdb(model):
    """ Standard syncdb expects models to be in reliable locations.

    Based on https://github.com/django/django/blob/1.9.3
    /django/core/management/commands/migrate.py#L285
    """
    connection = connections[DEFAULT_DB_ALIAS]
    with connection.schema_editor() as editor:
        editor.create_model(model)

main()

Here is the end of the output. You can see that it runs only one query for each table.

2018-10-30 15:58:25[DEBUG]django.db.backends.execute(): (0.000) SELECT "udjango_book"."id", "udjango_book"."name" FROM "udjango_book"; args=()
2018-10-30 15:58:25[DEBUG]django.db.backends.execute(): (0.000) SELECT "udjango_page"."id", "udjango_page"."number", "udjango_page"."book_id" FROM "udjango_page" WHERE "udjango_page"."book_id" IN (1, 2); args=(1, 2)
2018-10-30 15:58:25[DEBUG]django.db.backends.execute(): (0.000) SELECT "udjango_paragraph"."id", "udjango_paragraph"."number", "udjango_paragraph"."page_id" FROM "udjango_paragraph" WHERE "udjango_paragraph"."page_id" IN (1, 2, 3); args=(1, 2, 3)
Gone With The Wind
1 1
The Three Body Problem
1 1
1 2
2 1
2 2
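
The three SELECT statements in the log hint at how the prefetch works: one query per table, with the child rows matched to their parents in Python. The sketch below imitates that matching step with plain dictionaries. It is a simplified illustration, not Django's actual implementation; the row ids are assumed to follow the insertion order of the example, and `group_by` and `paragraphs_for` are hypothetical helper names.

```python
from collections import defaultdict

# Rows roughly as the three logged SELECTs would return them.
# The ids are an assumption based on insertion order in the example.
books = [{"id": 1, "name": "Gone With The Wind"},
         {"id": 2, "name": "The Three Body Problem"}]
pages = [{"id": 1, "number": 1, "book_id": 1},
         {"id": 2, "number": 1, "book_id": 2},
         {"id": 3, "number": 2, "book_id": 2}]
paragraphs = [{"id": 1, "number": 1, "page_id": 1},
              {"id": 2, "number": 1, "page_id": 2},
              {"id": 3, "number": 2, "page_id": 2},
              {"id": 4, "number": 1, "page_id": 3},
              {"id": 5, "number": 2, "page_id": 3}]


def group_by(rows, key):
    """Bucket child rows by the value of their foreign-key column."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[key]].append(row)
    return buckets


# One pass over each child table; no per-book lookups afterwards.
pages_by_book = group_by(pages, "book_id")
paragraphs_by_page = group_by(paragraphs, "page_id")


def paragraphs_for(book):
    """Return (page number, paragraph number) pairs for one book."""
    return [(page["number"], paragraph["number"])
            for page in pages_by_book[book["id"]]
            for paragraph in paragraphs_by_page[page["id"]]]


for book in books:
    print(book["name"], paragraphs_for(book))
```

The number of lookups against the source tables stays at three no matter how many books there are, which is the property the question asks for.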

Answer 1: (score: 1)

In addition to Don's answer, you can use a Prefetch object to apply any filter you need, for example:

from django.db import models, connection

def query():
    paragraph_filter = models.Prefetch(
        'page_set__paragraph_set',
        Paragraph.objects.filter(number__gt=1))

    books = Book.objects.all().prefetch_related(
        'page_set', paragraph_filter)

    for book in books:
        for page in book.page_set.all():
            for paragraph in page.paragraph_set.all():
                print(paragraph)

    print(connection.queries)

Django takes care of loading all the appropriate objects in a small number of queries (one per table, so you will get three queries).