如何使用postgres AVG()函数和Rails中的order子句

时间:2016-08-31 16:26:20

标签: sql ruby-on-rails ruby postgresql rails-activerecord

我有一个模型CompanyCompany有很多DailyDataDailyDatavolumedate

计算我写的最近10个工作日的平均数量:

class Array
  def sum
    inject(0) { |result, el| result + el }
  end

  def mean
    sum.to_d / size
  end
end

company = Company.first
company.daily_data.order(date: :desc).limit(10).pluck(:volume).mean

此代码工作正常,但我想使用postgres AVG()函数。

company.daily_data.select('AVG(volume) as average_volume').order(date: :desc)

此代码最终会出现错误:

PG::GroupingError: ERROR:  column "daily_data.date" must appear in the GROUP BY clause or be used in an aggregate function

但如果我将.group(:date)放在方法链中,则sql会重新生成多个结果。 如何使用postgresql AVG()函数获得最近的10卷平均值?

1 个答案:

答案 0 :(得分:2)

An ActiveRecord query like this:

company.daily_data.select('AVG(volume) as average_volume').order(date: :desc)

doesn't really make much sense. avg is an aggregate function in SQL so it needs to operate on a groups of rows. But you're not telling the database how to group the rows, you're telling the database to compute the average volume over the entire table and then order that single value by something that doesn't exist in the final result set.

Throwing a limit in:

company.daily_data
       .select('AVG(volume) as average_volume')
       .order(date: :desc)
       .limit(10)

won't help either because the limit is applied after the order and by then you've already confused the database with your attempted avg(volume).

I'd probably use a derived table if I was doing this in SQL, something like:

select avg(volume) as average_volume
from (
  select volume
  from where_ever...
  where what_ever...
  order by date desc
  limit 10
) dt

The derived table in the FROM clause finds the volumes that you want and then the overall query averages those 10 volumes.

Alternatively, you could use a subquery to grab the rows of interest:

select avg(volume) as average_volume
from where_ever...
where id in (
  select id
  from where_ever...
  where what_ever...
  order by date desc
  limit 10
)

The subquery approach is fairly straight forward to implement with ActiveRecord, something like this:

ten_most_recent = company.daily_data.select(:id).order(:date => :desc).limit(10)
company.daily_data.where(:id => ten_most_recent).average(:volume)

If you throw a to_sql call on the end of the second line you should see something that looks like the subquery SQL.

You can also make the derived table approach work with ActiveRecord but it is a little less natural. There is a from method in ActiveRecord that will take an ActiveRecord query to build the from (select ...) derived table construction but you'll want to be sure to manually name the derived table:

ten_most_recent = company.daily_data.select(:volume).order(:date => :desc).limit(10)
AnyModelAtAll.from(ten_most_recent, 'dt').average('dt.volume')

You have to use a string argument to average and include the dt. prefix to keep ActiveRecord from trying to add its own table name.

Of course you'd hide all this stuff in a method somewhere so that you could hide the details. Perhaps an extension method on the daily_data association:

has_many :daily_data, ... do
  def average_volume(days)
    #...
  end
end

so that you could say things like:

company.daily_data.average_volume(11)