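I need to compute the average of a datetime column in a table. I am using the Avg aggregate, but it returns a float instead of a datetime object. What exactly does this float represent?
Most importantly, how do I convert it into a datetime object?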
Answer 0: (score: 1)
For future reference, I had to deal with a similar problem. Consider these models:
class Championship(models.Model):
    ...

class Game(models.Model):
    date = models.DateField()
    championship = models.ForeignKey(Championship)
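Games belong to a championship, and I want to return the average date of that championship's games. For example, with one game on January 1 and another on January 3, I want January 2 returned.
With a PostgreSQL backend, aggregating with the built-in Avg does not work, because Avg is not designed for date or datetime fields: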
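To make the goal concrete, here is a minimal pure-Python sketch of a date mean, independent of Django. The function name `average_date` is only illustrative and is not part of the models above. It relies on the fact that subtracting two dates yields a `timedelta`, which can be summed and divided; any fractional day in the result is dropped when it is added back to a date.

```python
import datetime


def average_date(dates):
    """Return the mean of a non-empty sequence of datetime.date objects."""
    dates = list(dates)
    if not dates:
        raise ValueError("average_date() needs at least one date")
    origin = dates[0]
    # date - date gives a timedelta; timedeltas can be summed and divided.
    total = sum((d - origin for d in dates), datetime.timedelta(0))
    # date + timedelta ignores the seconds part, so the result is a whole day.
    return origin + total / len(dates)


print(average_date([datetime.date(2013, 1, 1), datetime.date(2013, 1, 3)]))
```

With the two dates from the example (January 1 and January 3), this returns January 2.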
>>> championship.game_set.aggregate(Avg('date'))
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "~/env/local/lib/python2.7/site-packages/django/db/models/manager.py", line 158, in aggregate
return self.get_query_set().aggregate(*args, **kwargs)
File "~/env/local/lib/python2.7/site-packages/django/db/models/query.py", line 359, in aggregate
return query.get_aggregation(using=self.db)
File "~/env/local/lib/python2.7/site-packages/django/db/models/sql/query.py", line 389, in get_aggregation
result = query.get_compiler(using).execute_sql(SINGLE)
File "~/env/local/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 840, in execute_sql
cursor.execute(sql, params)
File "~/env/local/lib/python2.7/site-packages/django/db/backends/util.py", line 41, in execute
return self.cursor.execute(sql, params)
File "~/env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 58, in execute
six.reraise(utils.DatabaseError, utils.DatabaseError(*tuple(e.args)), sys.exc_info()[2])
File "~/env/local/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 54, in execute
return self.cursor.execute(query, args)
DatabaseError: function avg(date) does not exist
LINE 1: SELECT AVG("games_game"."date") AS "date__avg" FROM "games_g...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
So I tried two solutions: one using Django querysets and Python, the other mostly raw SQL.
# Methods of Championship; the module needs `import datetime`.
# NOTE: the `week` argument assumes Game has a `week` field,
# which the model shown above does not include.

    def compute_avg_date(self, week):
        """
        Return the average date of the championship's games for a week.
        Converts dates to timedeltas in order to compute the mean in Python.
        """
        game_set = self.game_set.filter(week=week).values_list(
            'date', flat=True)
        origin_date = datetime.date.min
        try:
            return (
                sum(
                    map(lambda date: date - origin_date, game_set),
                    datetime.timedelta(0)) / len(game_set) + origin_date)
        except ZeroDivisionError:
            # No games in the set: fall back to today's date.
            return datetime.date.today()

    def compute_avg_date_db(self, week):
        """
        Does the same as above, but computes the mean in the database.
        """
        try:
            return self.game_set.filter(week=week).extra(
                select={
                    'avg_time': 'to_timestamp(avg(extract(epoch from date)))'
                }).values_list(
                    'avg_time', flat=True)[0].date()
        except AttributeError:
            # avg() over zero rows yields NULL (None), which has no .date().
            return datetime.date.today()
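The SQL expression above converts each date to seconds since the Unix epoch, averages those numbers, and converts the mean back to a timestamp. As a rough illustration of the same idea outside the database, here is a pure-Python sketch. The name `average_date_via_epoch` is hypothetical, and the sketch assumes UTC throughout, whereas the value PostgreSQL returns from `to_timestamp` depends on the session time zone.

```python
import calendar
import datetime


def average_date_via_epoch(dates):
    """Average dates by going through epoch seconds, treating each as midnight UTC."""
    dates = list(dates)
    if not dates:
        # Mirrors SQL AVG over zero rows, which yields NULL.
        return None
    # Roughly what extract(epoch from date) does: seconds since 1970-01-01.
    seconds = [calendar.timegm(d.timetuple()) for d in dates]
    mean_seconds = sum(seconds) / len(seconds)
    # Roughly what to_timestamp(...) does, followed by .date() on the result.
    return datetime.datetime.fromtimestamp(
        mean_seconds, tz=datetime.timezone.utc).date()


print(average_date_via_epoch(
    [datetime.date(2013, 1, 1), datetime.date(2013, 1, 3)]))
```

The empty case returns None here, which is why the database version above catches AttributeError when calling `.date()` on the result.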
I expected the database-only version to be faster, so I ran a small benchmark.
>>> s = """\
... from championships.models import Championship
... champ = Championship.objects.get(pk=1)
... champ.compute_avg_date_db(10)
... """
>>> s1 = """\
... from championships.models import Championship
... champ = Championship.objects.get(pk=1)
... champ.compute_avg_date(10)
... """
>>> timeit.timeit(stmt=s, number=1000)
8.195073127746582
>>> timeit.timeit(stmt=s1, number=1000)
6.377335071563721
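I ran a few other similar tests, and all of them showed that compute_avg_date, the method using Django querysets and Python, is slightly faster than the raw SQL method. I am no expert, so if anyone can explain why, comments are welcome.
Answer 1: (score: 0)
I just tested this against my own model, and the float appears to be the average of the year.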
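One thing that may be worth checking in a benchmark like this: each timed statement also runs the import and the `Championship.objects.get(pk=1)` lookup on every iteration. `timeit.timeit` accepts a `setup` argument that runs once and is excluded from the timing, so only the call under test is measured. The sketch below shows the pattern with plain Python dates so that it runs without Django; in the real benchmark, the import and the `get()` call would go into `setup`. This does not explain the difference by itself, it only narrows what is being timed.

```python
import timeit

# Runs once, outside the timed region.
setup = """
import datetime
dates = [datetime.date(2013, 1, 1) + datetime.timedelta(days=i) for i in range(100)]
origin = datetime.date.min
"""

# Only this statement is timed, `number` times.
stmt = """
sum((d - origin for d in dates), datetime.timedelta(0)) / len(dates) + origin
"""

elapsed = timeit.timeit(stmt=stmt, setup=setup, number=1000)
print(elapsed)  # total seconds for 1000 runs; varies by machine
```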