Rails: limiting results per group with an ActiveRecord query

Time: 2014-01-17 09:41:43

Tags: ruby-on-rails ruby postgresql activerecord group-by

My question is similar to this one - Active Record LIMIT within GROUP_BY

I want to limit the ActiveSupport::OrderedHash to a specific number of entries (100) per site.

The limit could be 100 or 4.

For simplicity, I'll assume it is 4.

Session.website_only.during(date_range)
       .count(group: [:site_id, :referrer_host],
              order: 'count_all DESC',
              limit: 4)

The generated SQL query looks like this:

SELECT COUNT(*) AS count_all, site_id AS site_id,
referrer_host AS referrer_host FROM "sessions"
WHERE "sessions"."created_at" >= '2013-12-09 00:00:00.000000'
AND "sessions"."created_at" <= '2013-12-16 23:59:59.999999' AND
(referrer_host IS NOT NULL)
AND (("sessions"."referrer_host" NOT ILIKE '%google.com%'
AND "sessions"."referrer_host" NOT ILIKE '%yahoo.com%'
AND "sessions"."referrer_host" NOT ILIKE '%bing.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%aol.com%')) 
AND (("sessions"."referrer_host" NOT ILIKE '%twitter.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%facebook.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%linkedin.com%' 
AND "sessions"."referrer_host" NOT ILIKE '%fb.me%')) 
GROUP BY "sessions"."site_id", "sessions"."referrer_host" 
ORDER BY count_all DESC LIMIT 4

Updated question


What I get

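An ActiveSupport::OrderedHash containing the session counts for all sites, grouped by site_id and referrer_host.

Here is an example of the actual result. The hash is grouped, but the limit applies to the whole result set; what I want is for each group (each site_id) to be limited to 100 entries.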

{[1, "https"]=>8769, [1, "www.example.com"]=>2359, [1, "www.xyz.com"]=>1935, [1, "www.bayers.com"]=>379, 
[2, "www.ruby.com"]=>1322, [2, "www.employment.com"]=>472, [2, "https"]=>424, 
[3, "www.rails.com"]=>424, [3, "www.arizona.net"]=>392, [3, "www.murphy.com"]=>390, 
[4, "www.associates.com"]=>374, [4, "www.reddit.com"]=>365, [4, "www.razorshape.com"]=>352, 
[5, "www.rediff.com"]=>337, [5, "www.tumbleweed.com"]=>327, [5, "www.arizona.com"]=>289, 
[6, "https"]=>275, [131, "www.example.com"]=>253, [6, "www.murphy.com"]=>236, [6, "www.associates.com"]=>227}

What I want

Instead of an arbitrary number of entries in each group, I want to limit each group to 4.

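1 answer:

Answer 0 (score: 2)

I don't think there is a way to do this in the database without computing the values for all rows and then filtering. So in this case I would rather filter in Ruby, which keeps the code clearer. Something like this: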

data = {[1, "https"]=>8769, [1, "www.example.com"]=>2359, [1, "www.xyz.com"]=>1935, [1, "www.bayers.com"]=>379, 
[2, "www.ruby.com"]=>1322, [2, "www.employment.com"]=>472, [2, "https"]=>424, 
[3, "www.rails.com"]=>424, [3, "www.arizona.net"]=>392, [3, "www.murphy.com"]=>390, 
[4, "www.associates.com"]=>374, [4, "www.reddit.com"]=>365, [4, "www.razorshape.com"]=>352, 
[5, "www.rediff.com"]=>337, [5, "www.tumbleweed.com"]=>327, [5, "www.arizona.com"]=>289, 
[6, "https"]=>275, [131, "www.example.com"]=>253, [6, "www.murphy.com"]=>236, [6, "www.associates.com"]=>227}

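limit = 4  # or 100
# counts tracks how many entries have been kept so far for each site_id
# (missing keys default to 0)
counts = Hash.new(0)
# result collects the entries that are kept
result = Hash.new

# Keeps the first `limit` entries seen for each site, so the entries of a
# site are assumed to arrive in descending count order
data.each do |k, v|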
  site = k[0]
  if counts[site] < limit
    counts[site]+=1
    result[k]=v
  end
end

puts counts
puts result

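The final format of the counts structure is not exactly the same as that of the data structure, but it can easily be converted back. Running code can be found at http://rubyfiddle.com/riddles/26fc3/2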
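As a variation on the loop above, here is a minimal sketch that groups the entries by site_id and sorts each group explicitly, so it does not depend on the order in which the hash arrives. The helper name top_per_site and the shortened sample data are assumptions made for illustration; they are not part of the original answer. It also assumes Ruby 2.1 or later, for Array#to_h.

```ruby
# Hypothetical helper (not from the original answer): keep the `limit`
# largest counts for each site_id, whatever the order of the input hash.
def top_per_site(data, limit)
  data
    # Group the [key, count] pairs by the site_id part of the key
    .group_by { |(site_id, _host), _count| site_id }
    # Within each site, sort by count descending and keep the first `limit`
    .flat_map { |_site_id, rows| rows.sort_by { |_key, count| -count }.first(limit) }
    # Rebuild a hash with the same [site_id, referrer_host] => count shape
    .to_h
end

# Shortened sample in the same shape as the query result, deliberately
# not sorted by count within site 1
sample = {
  [1, "www.example.com"] => 2359,
  [1, "https"]           => 8769,
  [1, "www.xyz.com"]     => 1935,
  [2, "www.ruby.com"]    => 1322,
  [2, "https"]           => 424
}

top = top_per_site(sample, 2)
puts top.inspect
```

The result keeps the [site_id, referrer_host] => count format of the original hash, so it could be used wherever the unfiltered result was used. Like the loop above, it still loads every grouped row into memory before filtering.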