When aggregating a queryset, I noticed that I get a wrong result if I used an annotation beforehand. I can't understand why.
The code
from django.db.models import QuerySet, Max, F, ExpressionWrapper, DecimalField, Sum
from orders.models import OrderOperation

class OrderOperationQuerySet(QuerySet):
    def last_only(self) -> QuerySet:
        # Keep only the operation whose pk is the highest among its order's operations.
        return self \
            .annotate(last_oo_pk=Max('order__orderoperation__pk')) \
            .filter(pk=F('last_oo_pk'))

    @staticmethod
    def _hist_price(orderable_field):
        # Line amount: historical unit price multiplied by quantity.
        return ExpressionWrapper(
            F(f'{orderable_field}__hist_unit_price') * F(f'{orderable_field}__quantity'),
            output_field=DecimalField())

    def ordered_articles_data(self):
        return self.aggregate(
            sum_ordered_articles_amounts=Sum(self._hist_price('orderedarticle')))
The test
qs1 = OrderOperation.objects.filter(order__pk=31655)
qs2 = OrderOperation.objects.filter(order__pk=31655).last_only()
assert qs1.count() == qs2.count() == 1 and qs1[0] == qs2[0]  # shows that both querysets contain the same object
qs1.ordered_articles_data()
> {'sum_ordered_articles_amounts': Decimal('3.72')} # expected result
qs2.ordered_articles_data()
> {'sum_ordered_articles_amounts': Decimal('3.01')} # wrong result
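How can this last_only annotation method make the aggregation result different (and wrong)?
The "funny" thing is that it seems to happen only when the order contains articles that have the same hist_price:
[screenshot]
Side note: if last_only() is applied first and the aggregation is then called in a second query, it works as expected.
SQL queries
(Note that these are the actual queries, but the code above has been slightly simplified, which explains the presence of COALESCE and "deleted" IS NULL below.)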
- qs1.ordered_articles_data()
SELECT
    COALESCE(
        SUM(
            ("orders_orderedarticle"."hist_unit_price" * "orders_orderedarticle"."quantity")
        ),
    0) AS "sum_ordered_articles_amounts"
FROM "orders_orderoperation"
    LEFT OUTER JOIN "orders_orderedarticle"
        ON ("orders_orderoperation"."id" = "orders_orderedarticle"."order_operation_id")
WHERE ("orders_orderoperation"."order_id" = 31655 AND "orders_orderoperation"."deleted" IS NULL)
- qs2.ordered_articles_data()
SELECT COALESCE(SUM(("__col1" * "__col2")), 0)
FROM (
    SELECT
        "orders_orderoperation"."id" AS Col1,
        MAX(T3."id") AS "last_oo_pk",
        "orders_orderedarticle"."hist_unit_price" AS "__col1",
        "orders_orderedarticle"."quantity" AS "__col2"
    FROM "orders_orderoperation" INNER JOIN "orders_order"
        ON ("orders_orderoperation"."order_id" = "orders_order"."id")
    LEFT OUTER JOIN "orders_orderoperation" T3
        ON ("orders_order"."id" = T3."order_id")
    LEFT OUTER JOIN "orders_orderedarticle"
        ON ("orders_orderoperation"."id" = "orders_orderedarticle"."order_operation_id")
    WHERE ("orders_orderoperation"."order_id" = 31655 AND "orders_orderoperation"."deleted" IS NULL)
    GROUP BY
        "orders_orderoperation"."id",
        "orders_orderedarticle"."hist_unit_price",
        "orders_orderedarticle"."quantity"
    HAVING "orders_orderoperation"."id" = (MAX(T3."id"))
) subquery
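Answer 0 (score: 1)
When you use an annotation with an aggregate function in the database language (SQL), you have to group by every selected field other than the aggregate function itself, and you can see this in the subquery: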
GROUP BY
"orders_orderoperation"."id",
"orders_orderedarticle"."hist_unit_price",
"orders_orderedarticle"."quantity"
HAVING "orders_orderoperation"."id" = (MAX(T3."id"))
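As a result, articles with the same hist_unit_price and quantity fall into the same group, so after the HAVING filter on the max id they contribute only one row to the outer SUM. So, judging by your screenshot, one of chocolate or cafe is excluded.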
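The collapse described above can be reproduced outside Django. The following is a minimal sketch using Python's built-in sqlite3 module, not the asker's real schema: the table layout is simplified (no orders_order table, no "deleted" column), the prices are hypothetical values stored as integer cents, and the article named 'other' is invented for illustration. Only the names chocolate and cafe come from the thread.

```python
import sqlite3

# Toy schema (an assumption): amounts are integer cents to keep arithmetic exact.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderoperation (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE orderedarticle (
        id INTEGER PRIMARY KEY,
        order_operation_id INTEGER,
        name TEXT,
        hist_unit_price INTEGER,
        quantity INTEGER
    );
    INSERT INTO orderoperation VALUES (1, 31655);
    INSERT INTO orderedarticle VALUES
        (1, 1, 'chocolate', 71, 1),
        (2, 1, 'cafe', 71, 1),
        (3, 1, 'other', 230, 1);
""")

# Same shape as the qs1 query: a plain SUM over the joined rows.
plain_sum = conn.execute("""
    SELECT SUM(oa.hist_unit_price * oa.quantity)
    FROM orderoperation oo
    LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
    WHERE oo.order_id = 31655
""").fetchone()[0]

# Same shape as the qs2 query: the inner GROUP BY merges the two articles
# that share (hist_unit_price, quantity) before the outer SUM runs.
grouped_sum = conn.execute("""
    SELECT SUM(col1 * col2)
    FROM (
        SELECT oo.id, MAX(t3.id) AS last_oo_pk,
               oa.hist_unit_price AS col1, oa.quantity AS col2
        FROM orderoperation oo
        LEFT OUTER JOIN orderoperation t3 ON oo.order_id = t3.order_id
        LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
        WHERE oo.order_id = 31655
        GROUP BY oo.id, oa.hist_unit_price, oa.quantity
        HAVING oo.id = MAX(t3.id)
    ) subquery
""").fetchone()[0]

print(plain_sum, grouped_sum)  # 372 301
```

With these made-up prices the difference between the two sums is exactly one 71-cent article, which mirrors the 3.72 versus 3.01 gap in the question.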
Answer 1 (score: 0)
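Separating the query into subqueries with fewer joins is a solution that prevents problems when more joins to child objects are added, such as a huge Cartesian product of independent sets, or complicated control of the GROUP BY clause in the resulting SQL, to which several elements of the query contribute.
Solution: a subquery gets the primary keys of the last order operations. The outer query is then a simple one, without added joins or groups, so it is not distorted by a possible aggregation on children.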
def last_only(self) -> QuerySet:
    max_ids = (self.values('order').order_by()
               .annotate(last_oo_pk=Max('order__orderoperation__pk'))
               .values('last_oo_pk')
               )
    return self.filter(pk__in=max_ids)
Test:
ret = (OrderOperationQuerySet(OrderOperation).filter(order__in=[some_order])
       .last_only().ordered_articles_data())
Executed SQL (simplified by removing the app name prefix orders_ and the double quotes "):
SELECT CAST(SUM((orderedarticle.hist_unit_price * orderedarticle.quantity))
    AS NUMERIC) AS sum_ordered_articles_amounts
FROM orderoperation
    LEFT OUTER JOIN orderedarticle ON (orderoperation.id = orderedarticle.order_operation_id)
WHERE (
    orderoperation.order_id IN (31655) AND
    orderoperation.id IN (
        SELECT MAX(U2.id) AS last_oo_pk
        FROM orderoperation U0
            INNER JOIN order U1 ON (U0.order_id = U1.id)
            LEFT OUTER JOIN orderoperation U2 ON (U1.id = U2.order_id)
        WHERE U0.order_id IN (31655)
        GROUP BY U0.order_id
    )
)
The original invalid SQL could be fixed by adding "orders_orderedarticle".id to GROUP BY, but only if last_only() and ordered_articles_data() are used together. That is not a readable approach.
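Both fixes mentioned in this answer can be compared on a toy database. This is a rough sketch with Python's built-in sqlite3 module under stated assumptions: the schema is simplified (no order table, so the self-join goes through order_id directly), amounts are hypothetical integer cents, and the articles 'old' and 'other' plus the second, older operation are invented for illustration.

```python
import sqlite3

# Toy data (an assumption): order 31655 has an older operation (id 1)
# and a latest operation (id 2) holding two articles with identical
# price and quantity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orderoperation (id INTEGER PRIMARY KEY, order_id INTEGER);
    CREATE TABLE orderedarticle (
        id INTEGER PRIMARY KEY,
        order_operation_id INTEGER,
        name TEXT,
        hist_unit_price INTEGER,
        quantity INTEGER
    );
    INSERT INTO orderoperation VALUES (1, 31655), (2, 31655);
    INSERT INTO orderedarticle VALUES
        (1, 1, 'old', 500, 1),
        (2, 2, 'chocolate', 71, 1),
        (3, 2, 'cafe', 71, 1),
        (4, 2, 'other', 230, 1);
""")

# Fix 1: pick the last operation in a subquery, then aggregate in a
# plain outer query with no GROUP BY on the article columns.
subquery_sum = conn.execute("""
    SELECT SUM(oa.hist_unit_price * oa.quantity)
    FROM orderoperation oo
    LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
    WHERE oo.order_id IN (31655)
      AND oo.id IN (
          SELECT MAX(u2.id)
          FROM orderoperation u0
          LEFT OUTER JOIN orderoperation u2 ON u0.order_id = u2.order_id
          WHERE u0.order_id IN (31655)
          GROUP BY u0.order_id
      )
""").fetchone()[0]

# Fix 2: keep the original query shape but add the article id to
# GROUP BY, so identical (price, quantity) pairs stay in separate groups.
grouped_with_id_sum = conn.execute("""
    SELECT SUM(col1 * col2)
    FROM (
        SELECT oo.id, MAX(t3.id) AS last_oo_pk,
               oa.hist_unit_price AS col1, oa.quantity AS col2
        FROM orderoperation oo
        LEFT OUTER JOIN orderoperation t3 ON oo.order_id = t3.order_id
        LEFT OUTER JOIN orderedarticle oa ON oo.id = oa.order_operation_id
        WHERE oo.order_id = 31655
        GROUP BY oo.id, oa.id, oa.hist_unit_price, oa.quantity
        HAVING oo.id = MAX(t3.id)
    ) subquery
""").fetchone()[0]

print(subquery_sum, grouped_with_id_sum)  # 372 372
```

Both variants return the total of the latest operation only. The subquery form keeps the filtering and the aggregation independent, which is why it stays correct regardless of which queryset methods are chained afterwards.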