Reusing a subquery for ordering in the Django ORM

Time: 2019-06-27 10:53:35

Tags: django postgresql django-models

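I run a dog salon where dogs come in for haircuts only infrequently. To encourage owners to return, I would like to send out vouchers for their next visit. A voucher depends on whether a dog has had a haircut between 2 months and 2 years ago. Anything older than 2 years means we can assume the customer is lost, and anything within the last 2 months is too soon after the previous haircut. We will target the owners who visited most recently first.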

My underlying database is PostgreSQL.

import uuid
from datetime import timedelta
from django.db import models
from django.db.models import Max, OuterRef, Subquery
from django.utils import timezone


# Dogs have one owner, owners can have many dogs, dogs can have many haircuts

class Owner(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=255)


class Dog(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    owner = models.ForeignKey(Owner, on_delete=models.CASCADE, related_name="dogs")
    name = models.CharField(max_length=255)


class Haircut(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    dog = models.ForeignKey(Dog, on_delete=models.CASCADE, related_name="haircuts")
    at = models.DateField()


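today = timezone.now().date()
# timedelta has no years/months arguments, so approximate the window in days
start = today - timedelta(days=2 * 365)  # about 2 years ago
end = today - timedelta(days=60)  # about 2 months ago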
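
Note that `datetime.timedelta` accepts days, weeks and smaller units, not months or years, so the window bounds need either a day-count approximation or calendar-aware arithmetic. Below is a minimal standard-library sketch of the latter. The helper name `months_ago` is hypothetical, and clamping to the last day of a shorter month is an assumption about the desired behaviour.

```python
import calendar
from datetime import date


def months_ago(d: date, months: int) -> date:
    """Return the date `months` calendar months before `d`.

    The day is clamped to the last day of the target month,
    so 31 March minus one month gives 28 (or 29) February.
    """
    # Count months from year 0 so that borrowing across years is simple.
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


today = date(2019, 6, 27)      # fixed date so the example is reproducible
start = months_ago(today, 24)  # two years ago
end = months_ago(today, 2)     # two months ago
print(start, end)
```

In a real project the fixed `today` would be replaced by `timezone.now().date()`, as in the question.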

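It strikes me that the problem can be broken down into two queries. The first aggregates, per owner, the most recent haircut of any of their dogs that falls between 2 months and 2 years ago.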

dog_aggregate = Haircut.objects.annotate(Max("at")).filter(at__range=(start, end))

The second joins that result to the owners table.

owners_by_shaggiest_dog_1 = Owner.objects # what's the rest of this?

This should produce SQL similar to the following:

select
  owner.id,
  owner.name
from
  (
    select
      dog.owner_id,
      max(haircut.at) last_haircut
    from haircut
      left join dog on haircut.dog_id = dog.id
    where
      haircut.at
        between current_date - interval '2' year
            and current_date - interval '2' month
    group by
      dog.owner_id
  ) dog_aggregate
  left join owner on dog_aggregate.owner_id = owner.id
order by
  dog_aggregate.last_haircut asc,
  owner.name;
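
To check the shape of that SQL without a PostgreSQL server, here is a minimal sketch that runs an equivalent query against an in-memory SQLite database. The sample rows, the fixed reference date, and the use of SQLite's `date()` modifiers in place of PostgreSQL's `interval` syntax are all assumptions made for illustration. The `last_haircut` column is added to the select list only to make the ordering visible.

```python
import sqlite3

TODAY = "2019-06-27"  # fixed reference date (assumption) instead of current_date

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table owner (id integer primary key, name text);
    create table dog (id integer primary key, owner_id integer, name text);
    create table haircut (id integer primary key, dog_id integer, at text);

    insert into owner values (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    insert into dog values
        (10, 1, 'Rex'), (11, 1, 'Fido'), (20, 2, 'Spot'), (30, 3, 'Max');
    insert into haircut values
        (1, 10, '2018-01-10'),  -- Alice, inside the window
        (2, 11, '2019-02-01'),  -- Alice, inside the window (her latest)
        (3, 20, '2018-09-15'),  -- Bob, inside the window
        (4, 30, '2016-01-01');  -- Carol, older than two years
""")

rows = conn.execute("""
    select owner.id, owner.name, dog_aggregate.last_haircut
    from (
        select dog.owner_id, max(haircut.at) as last_haircut
        from haircut
            left join dog on haircut.dog_id = dog.id
        where haircut.at
            between date(:today, '-2 years') and date(:today, '-2 months')
        group by dog.owner_id
    ) dog_aggregate
    left join owner on dog_aggregate.owner_id = owner.id
    order by dog_aggregate.last_haircut asc, owner.name
""", {"today": TODAY}).fetchall()

for row in rows:
    print(row)
```

With this sample data Carol does not appear at all, because her only haircut falls outside the window and the grouped subquery produces no row for her.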

After a bit of experimenting, I managed to get the correct results with:

haircut_annotation = Subquery(
    Haircut.objects
    .filter(dog__owner=OuterRef("pk"), at__range=(start, end))
    .order_by("-at")
    .values("at")[:1]
)

owners_by_shaggiest_dog_2 = (
    Owner.objects
    .annotate(last_haircut=haircut_annotation)
    .order_by("-last_haircut", "name")
)

However, the generated SQL looks inefficient, because a new subquery is executed for every row:

select
  owner.id,
  owner.name,
  (
    select haircut.at
    from haircut
      inner join dog on haircut.dog_id = dog.id
    where haircut.at
            between current_date - interval '2' year
                and current_date - interval '2' month
      and dog.owner_id = (owner.id)
    order by
      haircut.at desc
    limit 1
  ) last_haircut
from
  owner
order by
  last_haircut asc,
  owner.name;
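
Apart from performance, the correlated form behaves differently from the grouped form in one respect: it keeps every owner and yields NULL when no haircut falls inside the window. The sketch below reproduces this on an in-memory SQLite database. The sample rows, the fixed reference date, and SQLite's `date()` modifiers (standing in for PostgreSQL's `interval` syntax) are assumptions for illustration. The result is ordered by name only, because PostgreSQL and SQLite place NULLs at opposite ends by default when sorting.

```python
import sqlite3

TODAY = "2019-06-27"  # fixed reference date (assumption) instead of current_date

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table owner (id integer primary key, name text);
    create table dog (id integer primary key, owner_id integer, name text);
    create table haircut (id integer primary key, dog_id integer, at text);

    insert into owner values (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    insert into dog values
        (10, 1, 'Rex'), (11, 1, 'Fido'), (20, 2, 'Spot'), (30, 3, 'Max');
    insert into haircut values
        (1, 10, '2018-01-10'),
        (2, 11, '2019-02-01'),
        (3, 20, '2018-09-15'),
        (4, 30, '2016-01-01');  -- Carol: nothing inside the window
""")

rows = conn.execute("""
    select
        owner.id,
        owner.name,
        (
            select haircut.at
            from haircut
                inner join dog on haircut.dog_id = dog.id
            where haircut.at
                between date(:today, '-2 years') and date(:today, '-2 months')
              and dog.owner_id = owner.id
            order by haircut.at desc
            limit 1
        ) as last_haircut
    from owner
    order by owner.name
""", {"today": TODAY}).fetchall()

for row in rows:
    print(row)
```

Carol is returned with `None` as her `last_haircut`, so an application using this form would still need to filter such owners out before sending vouchers.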

P.S. I don't actually run a dog salon, so I can't give you a voucher. Sorry!

1 answer:

Answer 0: (score: 1)

If I have understood correctly, you can write the query like this:

from django.db.models import Max

Owner.objects.filter(
    dogs__haircuts__at__range=(start, end)
).annotate(
    last_haircut=Max('dogs__haircuts__at')
).order_by('last_haircut', 'name')

The last haircut is the Max here, because later dates compare as larger values.

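Note, however, that neither your query nor this one excludes owners whose dogs have had a more recent haircut. Those recent haircuts are simply left out when last_haircut is calculated.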

If you want to exclude such owners, you should build a query like this:

from django.db.models import Max

Owner.objects.exclude(
    dogs__haircuts__at__gt=end
).filter(
    dogs__haircuts__at__range=(start, end)
).annotate(
    last_haircut=Max('dogs__haircuts__at')
).order_by('last_haircut', 'name')
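
To make the difference between the two queries concrete, here is a plain-Python sketch of the same logic over in-memory data, with no Django involved. The sample records, the fixed window, and the helper names `last_haircut_in_window` and `ranked_owners` are hypothetical and only mirror the filter, exclude, Max and ordering steps described above.

```python
from datetime import date

start, end = date(2017, 6, 27), date(2019, 4, 27)  # fixed window (assumption)

# Owner name -> dates of all haircuts across that owner's dogs (sample data).
haircuts_by_owner = {
    "Alice": [date(2018, 1, 10), date(2019, 2, 1)],
    "Bob": [date(2018, 9, 15), date(2019, 6, 20)],  # also came in last week
    "Carol": [date(2016, 1, 1)],                    # nothing inside the window
}


def last_haircut_in_window(dates):
    """Latest haircut inside [start, end], or None (the filter plus Max)."""
    in_window = [d for d in dates if start <= d <= end]
    return max(in_window) if in_window else None


def ranked_owners(exclude_recent):
    """Owners with a haircut in the window, ordered by (last_haircut, name)."""
    result = []
    for name, dates in haircuts_by_owner.items():
        last = last_haircut_in_window(dates)
        if last is None:
            continue  # no haircut in the window: dropped by the filter
        if exclude_recent and any(d > end for d in dates):
            continue  # has a haircut after `end`: dropped by the exclude
        result.append((last, name))
    return [name for last, name in sorted(result)]


print(ranked_owners(exclude_recent=False))
print(ranked_owners(exclude_recent=True))
```

Without the exclusion Bob still qualifies, even though his dog was in last week. With it, only Alice remains.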