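Simple Subquery with OuterRef

Time: 2017-05-03 21:13:40

Tags: python mysql django django-queryset django-database

I am trying to build a very simple Subquery that uses OuterRef (not for any practical purpose, just to get it working), but I keep running into the same error.

posts/models.py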

from django.db import models

class Tag(models.Model):
    name = models.CharField(max_length=120)
    def __str__(self):
        return self.name

class Post(models.Model):
    title = models.CharField(max_length=120)
    tags = models.ManyToManyField(Tag)
    def __str__(self):
        return self.title

manage.py shell code

>>> from django.db.models import OuterRef, Subquery
>>> from posts.models import Tag, Post
>>> tag1 = Tag.objects.create(name='tag1')
>>> post1 = Post.objects.create(title='post1')
>>> post1.tags.add(tag1)
>>> Tag.objects.filter(post=post1.pk)
<QuerySet [<Tag: tag1>]>
>>> tags_list = Tag.objects.filter(post=OuterRef('pk'))
>>> Post.objects.annotate(count=Subquery(tags_list.count()))

The last two lines should give me the number of tags for each Post object. Instead, I keep getting the same error:

ValueError: This queryset contains a reference to an outer query and may only be used in a subquery.

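2 Answers:

Answer 0 (score: 42)

One problem with your example is that queryset.count() cannot be used as a subquery, because .count() tries to evaluate the queryset immediately and return the count. Evaluating a queryset that contains an OuterRef outside of a subquery is what raises the ValueError above.

So one might think that the right approach is to use Count() instead. Perhaps something like this: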

Post.objects.annotate(
    count=Count(Tag.objects.filter(post=OuterRef('pk')))
)

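This won't work, for two reasons:

  1. The Tag queryset selects all Tag fields, while Count can only count on a single field. So Tag.objects.filter(post=OuterRef('pk')).only('pk') is needed (to select and count on tag.pk).

  2. Count itself is not a Subquery class; Count is an Aggregate. So the expression generated by Count is not recognized as a Subquery. We can fix that by wrapping the queryset in Subquery.

    Applying the fixes for 1) and 2) would produce:

    Post.objects.annotate(
        count=Count(Subquery(Tag.objects.filter(post=OuterRef('pk')).only('pk')))
    )
    

    However, if you inspect the query that is being generated: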

    SELECT 
        "tests_post"."id",
        "tests_post"."title",
        COUNT((SELECT U0."id" 
                FROM "tests_tag" U0 
                INNER JOIN "tests_post_tags" U1 ON (U0."id" = U1."tag_id") 
                WHERE U1."post_id" = ("tests_post"."id"))
        ) AS "count" 
    FROM "tests_post" 
    GROUP BY 
        "tests_post"."id",
        "tests_post"."title"
    

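    You may notice that there is a GROUP BY clause. This is because Count is an Aggregate. Right now it does not affect the result, but in some other cases it may. That is why the docs suggest a slightly different approach, in which the aggregation is moved into the subquery through a specific combination of values + annotate + values: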

    from django.db.models import Count, OuterRef, Subquery

    Post.objects.annotate(
        count=Subquery(
            Tag.objects.filter(post=OuterRef('pk'))
                # The first .values() call defines our GROUP BY clause.
                # It's important to filter on every field listed here,
                # otherwise there will be more than one group per row,
                # the subquery will return more than one row,
                # and a subquery is not allowed to do that.
                # In our example we group only by post,
                # and we filter by post via OuterRef.
                .values('post')
                # Count how many rows there are per group.
                .annotate(count=Count('pk'))
                # Return only the count.
                .values('count')
        )
    )
    

    Finally, this produces:

    SELECT 
        "tests_post"."id",
        "tests_post"."title",
        (SELECT COUNT(U0."id") AS "count" 
                FROM "tests_tag" U0 
                INNER JOIN "tests_post_tags" U1 ON (U0."id" = U1."tag_id") 
                WHERE U1."post_id" = ("tests_post"."id") 
                GROUP BY U1."post_id"
        ) AS "count" 
    FROM "tests_post"
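
To compare the two SQL shapes above outside Django, here is a minimal sketch using Python's built-in sqlite3 module. The table names (post, tag, post_tags) and the sample rows are simplified stand-ins, not the tables Django would create, and the SQL is hand-written to mirror the two generated queries rather than captured from the ORM. It assumes SQLite semantics, where a scalar subquery that matches several rows is reduced to its first row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO post VALUES (1, 'post1'), (2, 'post2');
    INSERT INTO tag VALUES (1, 'tag1'), (2, 'tag2');
    -- post1 has two tags, post2 has none
    INSERT INTO post_tags VALUES (1, 1), (1, 2);
""")

# Shape 1: the aggregate wraps a scalar subquery, as in Count(Subquery(...)).
count_of_subquery = conn.execute("""
    SELECT post.title,
           COUNT((SELECT t.id
                  FROM tag t
                  JOIN post_tags pt ON t.id = pt.tag_id
                  WHERE pt.post_id = post.id)) AS tag_count
    FROM post
    GROUP BY post.id, post.title
    ORDER BY post.id
""").fetchall()

# Shape 2: the aggregate lives inside the correlated subquery,
# as in Subquery(...values().annotate().values()).
TAG_COUNT_SUBQUERY = """
    (SELECT COUNT(t.id)
     FROM tag t
     JOIN post_tags pt ON t.id = pt.tag_id
     WHERE pt.post_id = post.id
     GROUP BY pt.post_id)
"""
subquery_with_count = conn.execute(
    f"SELECT post.title, {TAG_COUNT_SUBQUERY} AS tag_count "
    "FROM post ORDER BY post.id"
).fetchall()

# Same as shape 2, but an empty group is turned into 0 instead of NULL.
with_coalesce = conn.execute(
    f"SELECT post.title, COALESCE({TAG_COUNT_SUBQUERY}, 0) AS tag_count "
    "FROM post ORDER BY post.id"
).fetchall()

print(count_of_subquery)    # [('post1', 1), ('post2', 0)]
print(subquery_with_count)  # [('post1', 2), ('post2', None)]
print(with_coalesce)        # [('post1', 2), ('post2', 0)]
```

In this sketch the first shape reports 1 for a post that has two tags, because COUNT only sees a single non-NULL value per post; other databases such as PostgreSQL reject a scalar subquery that returns more than one row instead. The second shape counts correctly but yields NULL rather than 0 for a post without tags, which is why wrapping the Subquery in Coalesce (from django.db.models.functions) with a default of 0 may be worth considering.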
    

Answer 1 (score: 0)

The django-sql-utils package makes this kind of subquery aggregation simple. Just pip install django-sql-utils, and then:

from sql_util.utils import SubqueryCount
posts = Post.objects.annotate(tag_count=SubqueryCount('tags'))

SubqueryCount has the same API as Count, but it generates a subselect in the SQL instead of joining to the related table.
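
Why a subselect can behave differently from a join is easiest to see when two related tables are counted at once. The following is a rough sketch with Python's built-in sqlite3 module; the comment table is hypothetical (it is not part of the models in the question), the schema is simplified, and the SQL is hand-written for illustration rather than being the SQL that Django or django-sql-utils actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE post_tags (post_id INTEGER, tag_id INTEGER);
    -- hypothetical second relation, only here to show the effect
    CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER);
    INSERT INTO post VALUES (1, 'post1');
    INSERT INTO post_tags VALUES (1, 1), (1, 2);
    INSERT INTO comment VALUES (1, 1), (2, 1), (3, 1);
""")

# Join-based counting: the two joins multiply rows (2 tags x 3 comments),
# so both counts come out as 6.
joined = conn.execute("""
    SELECT post.title, COUNT(pt.tag_id), COUNT(c.id)
    FROM post
    LEFT JOIN post_tags pt ON pt.post_id = post.id
    LEFT JOIN comment c ON c.post_id = post.id
    GROUP BY post.id, post.title
""").fetchall()

# Subselect-based counting: each count is computed independently per post.
subselected = conn.execute("""
    SELECT post.title,
           (SELECT COUNT(*) FROM post_tags pt WHERE pt.post_id = post.id),
           (SELECT COUNT(*) FROM comment c WHERE c.post_id = post.id)
    FROM post
""").fetchall()

print(joined)       # [('post1', 6, 6)]
print(subselected)  # [('post1', 2, 3)]
```

With a single related table, as in the question, both styles would give the same count; the difference only shows up once several joins are combined in one query.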