Avoiding a Cartesian product with Django

Date: 2013-10-07 08:56:34

Tags: python sql django cartesian-product cross-join

I have the following three tables defined.

from django.db import models
from django.utils import timezone

class Operator(models.Model):
    DisplayName = models.CharField(max_length=64)

    class Meta:
        app_label = "Experiment"
        db_table = "EXPERIMENT_OPERATOR"

class OperatorSummary(models.Model):
    Operator = models.ForeignKey(Operator, related_name="TransactionSummary")
    TransactionCount = models.IntegerField()
    TransactionValue = models.DecimalField(max_digits=18, decimal_places=2)
    # Pass the callable itself; timezone.now() would be evaluated once at import time.
    StartTime = models.DateTimeField(default=timezone.now)

    class Meta:
        app_label = "Experiment"
        db_table = "EXPERIMENT_OPERATORSUMMARY"

class OperatorAlerts(models.Model):
    Operator = models.ForeignKey(Operator, related_name="AlertSummary")
    AlertScore = models.IntegerField()
    AlertCount = models.IntegerField()
    # Pass the callable itself; timezone.now() would be evaluated once at import time.
    StartTime = models.DateTimeField(default=timezone.now)

    class Meta:
        app_label = "Experiment"
        db_table = "EXPERIMENT_OPERATORALERTS"

For an operator, I want to retrieve the total AlertScore and TransactionCount for a given date range. The query I am using looks like this:

from datetime import datetime

from django.db.models import Sum
from django.utils import timezone

tz = timezone.get_default_timezone()
vs = Operator.objects.filter(DisplayName="Jimmy",
                             TransactionSummary__StartTime__gte=tz.localize(datetime(year=2013, month=10, day=1)),
                             AlertSummary__StartTime__gte=tz.localize(datetime(year=2013, month=10, day=1)))\
    .annotate(TotalTransactions=Sum("TransactionSummary__TransactionCount"),
              TotalAlerts=Sum("AlertSummary__AlertScore"))\
    .values("DisplayName", "TransactionSummary__TransactionCount", "AlertSummary__AlertScore")

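This query performs a Cartesian product and returns every combination of the matching rows from the OperatorAlerts and OperatorSummary tables. This is what it returns: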

{'AlertSummary__AlertScore': 20, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 10}
{'AlertSummary__AlertScore': 44, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 10}
{'AlertSummary__AlertScore': 543, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 10}
{'AlertSummary__AlertScore': 20, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 22}
{'AlertSummary__AlertScore': 44, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 22}
{'AlertSummary__AlertScore': 543, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 22}
{'AlertSummary__AlertScore': 20, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 234}
{'AlertSummary__AlertScore': 44, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 234}
{'AlertSummary__AlertScore': 543, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 234}
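To see where the nine rows come from, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table and column names (operator, summary, alerts) are simplified stand-ins for the models above, not the real EXPERIMENT_* tables, and the SQL is only an approximation of what the ORM would generate.

```python
import sqlite3

# Hypothetical, simplified schema standing in for the three Django models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operator (id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE summary  (operator_id INTEGER, transaction_count INTEGER);
    CREATE TABLE alerts   (operator_id INTEGER, alert_score INTEGER);
    INSERT INTO operator VALUES (1, 'Jimmy');
    INSERT INTO summary VALUES (1, 10), (1, 22), (1, 234);
    INSERT INTO alerts  VALUES (1, 20), (1, 44), (1, 543);
""")

# Joining both child tables to the parent pairs every summary row
# with every alert row of the same operator: 3 x 3 = 9 rows.
rows = conn.execute("""
    SELECT o.display_name, s.transaction_count, a.alert_score
    FROM operator o
    JOIN summary s ON s.operator_id = o.id
    JOIN alerts  a ON a.operator_id = o.id
""").fetchall()

print(len(rows))  # 9
```

Each summary row appears once per alert row and vice versa, which matches the output listed above.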

I would like to fix this so that I get the following result:

{'AlertSummary__AlertScore': 607, 'DisplayName': u'Jimmy', 'TransactionSummary__TransactionCount': 266}

All of the results are collapsed into a single row, with AlertScore and TransactionCount summed.

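Is this possible? I can always fall back to querying OperatorAlerts and OperatorSummary separately and then iterating over the result sets in Python to get the result I want, or to calling .aggregate, but I'm sure there must be a better way?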
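For reference, the fallback described here (one aggregate per child table, combined afterwards) can be sketched as follows. This again uses sqlite3 with hypothetical simplified table names, not the Django models themselves. In Django it would presumably correspond to one `.aggregate(Sum(...))` call on each of the OperatorSummary and OperatorAlerts querysets.

```python
import sqlite3

# Hypothetical, simplified schema standing in for the Django models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE summary (operator_id INTEGER, transaction_count INTEGER);
    CREATE TABLE alerts  (operator_id INTEGER, alert_score INTEGER);
    INSERT INTO summary VALUES (1, 10), (1, 22), (1, 234);
    INSERT INTO alerts  VALUES (1, 20), (1, 44), (1, 543);
""")

operator_id = 1  # assumed primary key of the operator "Jimmy"

# Each table is summed on its own, so no rows are ever cross-joined.
total_transactions = conn.execute(
    "SELECT COALESCE(SUM(transaction_count), 0) FROM summary WHERE operator_id = ?",
    (operator_id,),
).fetchone()[0]
total_alerts = conn.execute(
    "SELECT COALESCE(SUM(alert_score), 0) FROM alerts WHERE operator_id = ?",
    (operator_id,),
).fetchone()[0]

result = {
    "DisplayName": "Jimmy",
    "TotalTransactions": total_transactions,
    "TotalAlerts": total_alerts,
}
print(result)  # {'DisplayName': 'Jimmy', 'TotalTransactions': 266, 'TotalAlerts': 607}
```

It costs two round trips instead of one, but the totals match the desired 266 and 607.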

1 Answer:

Answer 0: (score: 1)

Try reversing the order in which you apply the values() and annotate() methods. values() should come first:

vs = Operator.objects.filter(DisplayName="Jimmy",
                             TransactionSummary__StartTime__gte=tz.localize(datetime(year=2013, month=10, day=1)),
                             AlertSummary__StartTime__gte=tz.localize(datetime(year=2013, month=10, day=1)))\
    .values("DisplayName")\
    .annotate(TotalTransactions=Sum("TransactionSummary__TransactionCount"),
              TotalAlerts=Sum("AlertSummary__AlertScore"))

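This groups the results by the fields named in values() and then generates the annotations for each group. The order is very important, as documented in the Django aggregation docs.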
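To make the grouping concrete, here is a sketch in plain SQL (via sqlite3) of roughly the query shape a values().annotate() chain corresponds to: a GROUP BY on the values() fields. The schema is a hypothetical simplification, and the SQL is an assumption about the shape of the query, not the exact statement Django emits. One thing worth checking in real code: both child tables are still joined, so each SUM runs over the cross-joined rows. The Django documentation warns that combining multiple aggregations in annotate() can give wrong results for this reason.

```python
import sqlite3

# Hypothetical, simplified schema standing in for the Django models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operator (id INTEGER PRIMARY KEY, display_name TEXT);
    CREATE TABLE summary  (operator_id INTEGER, transaction_count INTEGER);
    CREATE TABLE alerts   (operator_id INTEGER, alert_score INTEGER);
    INSERT INTO operator VALUES (1, 'Jimmy');
    INSERT INTO summary VALUES (1, 10), (1, 22), (1, 234);
    INSERT INTO alerts  VALUES (1, 20), (1, 44), (1, 543);
""")

# GROUP BY collapses the nine joined rows into one row per operator,
# but each SUM still sees every value three times (once per row of
# the other child table).
rows = conn.execute("""
    SELECT o.display_name,
           SUM(s.transaction_count) AS total_transactions,
           SUM(a.alert_score)       AS total_alerts
    FROM operator o
    JOIN summary s ON s.operator_id = o.id
    JOIN alerts  a ON a.operator_id = o.id
    GROUP BY o.display_name
""").fetchall()

print(rows)  # [('Jimmy', 798, 1821)]
```

With the sample data the result is a single row, but the totals are 3 x 266 and 3 x 607. If the same happens with the ORM query, aggregating each relation separately is one way around it.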

Applying values() and annotate() the way you did (i.e. annotate() before values()) generates the annotations for each item individually.

Note that the code above groups the results by DisplayName. You may want to group by some other field instead, such as pk.

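Also, I assume that in your real code you will want to fetch the values for multiple operators at once. If you always query one operator at a time (as you do in your example), you are better off using aggregate() instead of annotate().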