How to annotate a QuerySet with a filtered self-join in Django

Time: 2019-01-06 00:51:21

Tags: django django-models django-orm

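Motivation

I have a dataset of sports match results, and I want to annotate each match with the teams' past performance over a period of time leading up to that match.

The approach I came up with is: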

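  1. Annotate each Match with the set of matches that took place within the relevant period (MatchManager.matchset_within_period below).
  2. Annotate each Match with statistics aggregated over that set of relevant matches (MatchManager.annotate_with_stats below).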
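
To pin down what the two steps are meant to compute, here is a minimal pure-Python sketch of the same logic on in-memory data. The sample rows and the names `matches`, `stats` and `home_team_avg_score` are made up for illustration; this is a reference for the intended result, not the ORM query.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical sample data: match id -> (date, home team).
matches = {
    1: (datetime(2018, 1, 1), 'A'),
    2: (datetime(2018, 6, 1), 'A'),
    3: (datetime(2018, 9, 1), 'B'),
    4: (datetime(2018, 12, 1), 'A'),
}
# Hypothetical per-team rows, mirroring TeamMatchStats: (match id, team, score).
stats = [
    (1, 'A', 80), (1, 'B', 75),
    (2, 'A', 100), (2, 'B', 90),
    (3, 'B', 70), (3, 'A', 60),
    (4, 'A', 95), (4, 'B', 85),
]


def home_team_avg_score(match_id, for_days):
    """Average past score of the match's home team within the window."""
    date, home_team = matches[match_id]
    window = timedelta(days=for_days)
    # Step 1: ids of matches strictly before this one, at most `window` earlier.
    relevant = {
        other_id
        for other_id, (other_date, _) in matches.items()
        if other_date < date and date - other_date <= window
    }
    # Step 2: aggregate the home team's scores over that set of matches.
    scores = [
        score for other_id, team, score in stats
        if other_id in relevant and team == home_team
    ]
    return mean(scores) if scores else None


result = {match_id: home_team_avg_score(match_id, for_days=300) for match_id in matches}
```

With this data, match 4 only sees matches 2 and 3, because match 1 falls outside the 300-day window.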

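I can do this with a somewhat convoluted query (outlined below) that involves an extra Dataset model. I traverse up to that model and back down again to get a reference to the full set of matches, which I can then filter and aggregate.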

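Question

This approach seems really convoluted and is probably bad for performance. It is certainly hard for a reader to follow (or at least unintuitive).

Is it possible to obtain the set of matches needed for step (1) directly, without the extra model (for example, by using a Subquery directly on Match)?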
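
For reference, the query shape a direct self-join would aim for is a correlated subquery over the stats table. The sketch below expresses that shape as raw SQL run through Python's standard-library sqlite3 module rather than through Django. The table names (`matches`, `team_match_stats`), the column names and the sample rows are assumptions made up for illustration. In Django, Subquery together with OuterRef is the usual tool for correlated subqueries; whether it can produce exactly this query for the models below has not been verified here.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
# Hypothetical schema and rows; a real Django project would use its own tables.
conn.executescript('''
    CREATE TABLE matches (id INTEGER PRIMARY KEY, home_team TEXT, date TEXT);
    CREATE TABLE team_match_stats (match_id INTEGER, team TEXT, score INTEGER);
    INSERT INTO matches VALUES
        (1, 'A', '2018-01-01'), (2, 'A', '2018-06-01'),
        (3, 'B', '2018-09-01'), (4, 'A', '2018-12-01');
    INSERT INTO team_match_stats VALUES
        (1, 'A', 80), (1, 'B', 75), (2, 'A', 100), (2, 'B', 90),
        (3, 'B', 70), (3, 'A', 60), (4, 'A', 95), (4, 'B', 85);
''')

# For each match m, average the home team's scores over earlier matches
# that took place at most :days days before m. No Dataset table is needed.
query = '''
    SELECT m.id,
           (SELECT AVG(s.score)
              FROM team_match_stats AS s
              JOIN matches AS past ON past.id = s.match_id
             WHERE s.team = m.home_team
               AND past.date < m.date
               AND julianday(m.date) - julianday(past.date) <= :days
           ) AS home_team_avg_score
      FROM matches AS m
     ORDER BY m.id
'''
rows = conn.execute(query, {'days': 300}).fetchall()
```

Each row pairs a match id with the average, and the average is NULL (None in Python) when no earlier match falls inside the window.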


Example usage

In [1]

test_matches = Match.objects.filter(...)

Match.objects \
    .annotate_with_stats(for_days=300) \
    .filter(id__in=test_matches) \
    .values('pk', 'home_team_avg_score')

Out[1]

<MatchQuerySet [{'id': 287, 'home_team_avg_score': 91.04166666666667}, {'id': 288, 'home_team_avg_score': 91.21739130434783}, {'id': 289, 'home_team_avg_score': 92.45833333333333}]>

Simplified code

models.py (simplified)

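from django.db import models

# Assumes managers.py sits in the same package as models.py
from .managers import MatchManager


class Team(models.Model):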
    name = models.CharField(max_length=255, unique=True)


# This model has no semantic meaning - it's purely for the query
class Dataset(models.Model):
    name = models.CharField(max_length=255, unique=True)


class Season(models.Model):
    dataset = models.ForeignKey(
        Dataset, on_delete=models.CASCADE, related_name='seasons',
    )
    # ...


class Round(models.Model):
    season = models.ForeignKey(
        Season, on_delete=models.CASCADE, related_name='rounds',
    )
    # ...


class Match(models.Model):
    round = models.ForeignKey(
        Round, on_delete=models.CASCADE, related_name='matches',
    )
    home_team = models.ForeignKey(
        Team, on_delete=models.CASCADE, related_name='home_matches',
    )
    date = models.DateTimeField()
    # ...
    objects = MatchManager()


class TeamMatchStats(models.Model):
    match = models.ForeignKey(
        Match, on_delete=models.CASCADE, related_name='team_stats',
    )
    team = models.ForeignKey(
        Team, on_delete=models.CASCADE, related_name='match_stats',
    )
    score = models.IntegerField()
    # ...

managers.py (simplified)

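from datetime import timedelta

from django.db import models
from django.db.models import Avg, DurationField, ExpressionWrapper, F, Q


def fm(x):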
    '''
    Helper function for obtaining a self-referential matches set.
    '''
    if x.startswith('round'):
        raise ValueError('Cannot re-traverse upwards')
    return f'round__season__dataset__seasons__rounds__matches__{x}'


class MatchQuerySet(models.QuerySet):
    def matchset_within_period(self, td):
        # filter(date__lt): before this match
        # annotate/filter(time_before__lte): within x period
        return self \
                .filter(**{fm('date__lt'): F('date')}) \
                .annotate(
                        time_before=ExpressionWrapper(
                                F('date') - F(fm('date')),
                                output_field=DurationField(),
                        )
                ) \
                .filter(time_before__lte=td) \
                .values('pk')

    def annotate_with_stats(self, for_days):
        q_home_team = Q(**{fm('team_stats__team'): F('home_team')})
        team_avg_params = {
            'home_team_avg_score': Avg(
                fm('team_stats__score'), filter=q_home_team,
            )
        }  # In reality this is a dict comp getting a number of stats

        return self \
                .matchset_within_period(timedelta(days=for_days)) \
                .annotate(**team_avg_params)


MatchManager = MatchQuerySet.as_manager
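
To make the "up and back down" traversal concrete, this small sketch repeats the fm helper so it runs on its own and prints nothing; it only builds the lookup string and splits it into its hops. The variable names `lookup` and `hops` are introduced here for illustration.

```python
def fm(x):
    """Same helper as in managers.py, repeated so this snippet is standalone."""
    if x.startswith('round'):
        raise ValueError('Cannot re-traverse upwards')
    return f'round__season__dataset__seasons__rounds__matches__{x}'


# The lookup climbs Match -> Round -> Season -> Dataset, then descends
# through the reverse relations seasons -> rounds -> matches.
lookup = fm('date__lt')

# Splitting on the double underscore lists each hop in order,
# followed by the field name and the lookup type.
hops = lookup.split('__')
```

Three joins up plus three joins down for every annotated row is the part of the query a direct self-join would replace.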

0 answers:

No answers