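I have a Sleep model whose instances belong_to an instance of a Person model. I want to hand the statistics calculations off to a background thread. People self-report their data and may skip days.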
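I created a Sleepstat model and plan to calculate statistics for each day that has one or more Sleep records. However, people can come back later and edit their data, so in this background task I want to scan the existing Sleepstat instances and check the state of their needs_updating flag.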
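If someone creates a Sleep record on a day with no existing Sleepstat, I want the background task to create the Sleepstat and calculate that day's statistics. If someone adds another Sleep record on a date that already has a Sleepstat, I want to flag the Sleepstat as needing an update and recalculate it with the new data, so the statistics stay up to date.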
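My plan is to do the following:
1. Run a query that returns all Sleep records belonging to the Person in question. For that I use this query, which works as I expect:
all_sleeps = Sleep.select('start_time,end_time,multiday,time_zone,in_progress').where(:person_id => self.id)
2. Build an array of the unique start_time dates:
days_recorded = []
for sleep in all_sleeps
  days_recorded.push sleep.start_time.to_date
end
days_recorded = days_recorded.uniq
3. For each entry in days_recorded, check whether a Sleepstat exists. If not, create one and calculate the stats. If so, check whether it needs_updating. If it does, recalculate the stats; if not, move on to the next item in days_recorded.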
days_recorded.each do |d|
  stat = Sleepstat.where(:date => d).first
  if stat.nil?
    # No record, so create one because we have data for that day and calculate stats
    ...
  else
    # There is a record. Evaluate whether it needs to be updated
    if stat.needs_updating?
      # Update the Sleepstat
      ...
    end
  end
end
This approach results in many separate queries:
Sleepstat Load (0.2ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" = '2011-12-10'
Sleepstat Load (0.2ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" = '2011-12-11'
Sleepstat Load (0.2ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" = '2011-12-12'
Sleepstat Load (0.2ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" = '2011-12-13'
My idea was to first fetch all the Sleepstat records with a single query:
existing_stats = Sleepstat.where(:date => days_recorded)
and then iterate over them in step 3. My attempt looked like this:
existing_stats = Sleepstat.where(:date => days_recorded)
days_recorded.each do |d|
  stat = existing_stats.where(:date => d)
  if stat.nil? || stat.length == 0
    # No record, so create one because we have data for that day and calculate stats
    ...
  else
    # There is a record. Evaluate whether it needs to be updated
    if stat.needs_updating?
      # Update the Sleepstat
      ...
    end
  end
end
This resulted in just as many individual queries, each of them more complex:
Sleepstat Load (0.5ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" IN ('2011-12-07', '2011-12-06', '2011-12-08', '2011-12-09', '2011-12-10', '2011-12-11', '2011-12-12', '2011-12-13', '2011-12-14', '2011-12-15') AND "sleepstats"."date" = '2011-12-10'
Sleepstat Load (0.9ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" IN ('2011-12-07', '2011-12-06', '2011-12-08', '2011-12-09', '2011-12-10', '2011-12-11', '2011-12-12', '2011-12-13', '2011-12-14', '2011-12-15') AND "sleepstats"."date" = '2011-12-11'
Sleepstat Load (0.6ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" IN ('2011-12-07', '2011-12-06', '2011-12-08', '2011-12-09', '2011-12-10', '2011-12-11', '2011-12-12', '2011-12-13', '2011-12-14', '2011-12-15') AND "sleepstats"."date" = '2011-12-12'
Sleepstat Load (0.4ms) SELECT "sleepstats".* FROM "sleepstats" WHERE "sleepstats"."date" IN ('2011-12-07', '2011-12-06', '2011-12-08', '2011-12-09', '2011-12-10', '2011-12-11', '2011-12-12', '2011-12-13', '2011-12-14', '2011-12-15') AND "sleepstats"."date" = '2011-12-13'
How can I make this process more efficient so that I don't hit the database so many times?
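Answer 0 (score: 1)
If your statistics aren't too expensive to compute, it may be better to use a callback that calculates them every time a record is created or updated: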
class Sleep < ActiveRecord::Base
  before_save :create_or_update_stats

  def create_or_update_stats
    # skip the calculation unless the record is new or something changed
    return unless new_record? || changed?

    date = start_time.to_date
    stats = Sleepstat.find_or_create_by_date(date)
    sleeps = Sleep.where(date: date)
    # now calculate the stats and save them.
  end
end
Edit: of course, you would also have to add a callback on destroy. You get the idea.
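Additional tips:
Avoid the for syntax. It calls each internally, so why bother? This does the same thing as your step 2:
all_sleeps.map { |s| s.start_time.to_date }.uniq
# or even this
all_sleeps.map(&:start_time).map(&:to_date).uniq
To test whether a relation is empty, use stat.exists? instead of your condition.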
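If you would rather keep the background-task approach, the extra queries in the second attempt come from calling where on existing_stats inside the loop: each where builds a new relation and runs its own SELECT. One way around that is to load the stats once and look them up in memory by date. The following is only a sketch under assumptions: Stat is a plain Struct standing in for loaded Sleepstat records (for example the result of Sleepstat.where(:date => days_recorded).to_a), and plan_stat_work is a hypothetical helper name, not something from the question.

```ruby
require 'date'

# Stand-in for a loaded Sleepstat row (assumption: the real objects
# would be ActiveRecord instances with the same two attributes).
Stat = Struct.new(:date, :needs_updating)

# Hypothetical helper: decide what to do for each recorded day using one
# preloaded list of stats instead of one database lookup per day.
def plan_stat_work(days_recorded, existing_stats)
  # Build the date => stat lookup once. With ActiveSupport loaded,
  # existing_stats.index_by(&:date) builds the same hash.
  stats_by_date = existing_stats.each_with_object({}) do |stat, lookup|
    lookup[stat.date] = stat
  end

  days_recorded.map do |day|
    stat = stats_by_date[day]
    action =
      if stat.nil?
        :create # no stat yet for a day that has sleep data
      elsif stat.needs_updating
        :update # stat exists but is flagged as stale
      else
        :skip   # stat exists and is current
      end
    [day, action]
  end.to_h
end

days = [Date.new(2011, 12, 10), Date.new(2011, 12, 11), Date.new(2011, 12, 12)]
stats = [
  Stat.new(Date.new(2011, 12, 10), false),
  Stat.new(Date.new(2011, 12, 11), true)
]
plan = plan_stat_work(days, stats)
```

Here plan maps 2011-12-10 to :skip, 2011-12-11 to :update and 2011-12-12 to :create. In the real task you would create or recalculate the Sleepstat in place of returning a symbol; the point is that the hash lookup touches no database, so the loop costs one query for the stats regardless of how many days were recorded.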