For efficiency and optimization, should I use eager_load or includes?

Posted: 2015-10-08 17:48:50

Tags: ruby-on-rails ruby-on-rails-4 rails-activerecord eager-loading

Currently, my code looks like this:

current_user.association.includes(a: [:b, {c: :d}, {e: :f}]).to_a

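When this runs, each included association appears to be loaded with its own SELECT query against the database.

However, when I run current_user.association.eager_load(a: [:b, {c: :d}, {e: :f}]).to_a, I see one huge SELECT query.

I ask because I have not seen this before. I assumed eager_load would be more efficient because it makes fewer DB calls.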

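2 Answers:

Answer 0: (score: 3)

Since I can't infer the actual queries from your description (a: [:b, {c: :d}, {e: :f}]), I need to say a little about includes first.

includes is a query method that adapts to different situations.

Here is some sample code: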

# model and reference
class Blog < ActiveRecord::Base
  has_many :posts

  # t.string   "name"
  # t.string   "author"
end

class Post < ActiveRecord::Base
  belongs_to :blog

  # t.string   "title"
end

# seed
(1..3).each do |b_id|
  blog = Blog.create(name: "Blog #{b_id}", author: 'someone')
  (1..5).each { |p_id| blog.posts.create(title: "Post #{b_id}-#{p_id}") }
end

In one case, it fires two separate queries, just like preload.

> Blog.includes(:posts)
  Blog Load (2.8ms)  SELECT "blogs".* FROM "blogs"
  Post Load (0.7ms)  SELECT "posts".* FROM "posts" WHERE "posts"."blog_id" IN (1, 2, 3)
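
The two statements in that log can be mimicked in plain Ruby to show what a preload-style load does with them. This is only a sketch over in-memory hashes, not ActiveRecord's implementation; preload_posts is a hypothetical helper, and the data mirrors the seed above.

```ruby
# Plain-Ruby sketch of a preload-style load (not ActiveRecord's actual code).
# blogs_table and posts_table stand in for the tables created by the seed.
blogs_table = (1..3).map { |b| { id: b, name: "Blog #{b}" } }
posts_table = (1..3).flat_map do |b|
  (1..5).map { |p| { id: (b - 1) * 5 + p, title: "Post #{b}-#{p}", blog_id: b } }
end

# Hypothetical helper: returns each blog together with its posts.
def preload_posts(blogs, posts_table)
  # "Query" 1 has already run: blogs is the result of SELECT blogs.*
  ids = blogs.map { |blog| blog[:id] }
  # "Query" 2: SELECT posts.* WHERE blog_id IN (ids)
  posts = posts_table.select { |post| ids.include?(post[:blog_id]) }
  # Stitch the two result sets together in memory by foreign key.
  by_blog = posts.group_by { |post| post[:blog_id] }
  blogs.map { |blog| blog.merge(posts: by_blog.fetch(blog[:id], [])) }
end

loaded = preload_posts(blogs_table, posts_table)
puts loaded.map { |blog| "#{blog[:name]}: #{blog[:posts].size} posts" }
```

Each blog row and each post row is fetched exactly once; the association is assembled on the Ruby side.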

In the other case, when the query references the included table, it fires only a single LEFT OUTER JOIN query, just like eager_load.

> Blog.includes(:posts).where(posts: {title: 'Post 1-1'})
  SQL (0.3ms)  SELECT "blogs"."id" AS t0_r0, "blogs"."name" AS t0_r1, "blogs"."author" AS t0_r2, "blogs"."created_at" AS t0_r3, "blogs"."updated_at" AS t0_r4, "posts"."id" AS t1_r0, "posts"."title" AS t1_r1, "posts"."created_at" AS t1_r2, "posts"."updated_at" AS t1_r3, "posts"."blog_id" AS t1_r4 FROM "blogs" LEFT OUTER JOIN "posts" ON "posts"."blog_id" = "blogs"."id" WHERE "posts"."title" = ?  [["title", "Post 1-1"]]
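
The rule described above can be written down as a tiny decision function. This is a deliberately simplified model, not Rails source code: loading_strategy and its keyword arguments are hypothetical names, and real ActiveRecord works out the referenced tables itself, for example from hash conditions on the included table or an explicit references call.

```ruby
# Simplified model of how includes picks a strategy (not Rails source code).
# included:   tables named in includes(...)
# referenced: tables the rest of the query refers to, e.g. in where(...)
def loading_strategy(included:, referenced: [])
  if (included & referenced).empty?
    :separate_queries   # behaves like preload
  else
    :left_outer_join    # behaves like eager_load
  end
end

puts loading_strategy(included: [:posts])
puts loading_strategy(included: [:posts], referenced: [:posts])
```

The first call corresponds to Blog.includes(:posts), the second to Blog.includes(:posts).where(posts: {title: 'Post 1-1'}).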

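So I think what you are really asking about is the point where includes and eager_load differ, namely:

For efficiency and optimization, should we use two separate queries or one LEFT OUTER JOIN query?

This confused me too. After some digging, I found this article by Fabio Akita, which convinced me. Here are some quotes and an example: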

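"For some situations, the monster outer join becomes slower than many smaller queries. The bottom line is: generally it seems better to split a monster join into smaller ones. This avoids the cartesian product overload problem."

"The longer and more complex the result set, the more this matters, because the more objects Rails has to deal with. Allocating and deallocating several hundred or several thousand small duplicated objects is never a good deal."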
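
A quick way to see the cartesian product problem is to count rows. The sketch below assumes one parent record with several independent has_many associations pulled in by a single query; the association names and sizes are made up for illustration and do not come from the article.

```ruby
# Row counts for ONE parent record with several independent has_many
# associations. The names and sizes below are hypothetical.
children = { posts: 5, comments: 4, tags: 3 }

# One big LEFT OUTER JOIN: sibling associations multiply each other.
# An empty association still yields one row (filled with NULLs).
join_rows = children.values.map { |n| [n, 1].max }.reduce(1, :*)

# Separate queries: one row for the parent plus one row per child record.
separate_rows = 1 + children.values.sum

puts "single join:      #{join_rows} rows"
puts "separate queries: #{separate_rows} rows"
```

With these numbers the join returns 60 rows against 13 for separate queries, and the gap widens with every additional sibling association.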

Example of the data the query returns in Rails:

> Blog.eager_load(:posts).map(&:name).count
  SQL (0.9ms)  SELECT "blogs"."id" AS t0_r0, "blogs"."name" AS t0_r1, "blogs"."author" AS t0_r2, "blogs"."created_at" AS t0_r3, "blogs"."updated_at" AS t0_r4, "posts"."id" AS t1_r0, "posts"."title" AS t1_r1, "posts"."created_at" AS t1_r2, "posts"."updated_at" AS t1_r3, "posts"."blog_id" AS t1_r4 FROM "blogs" LEFT OUTER JOIN "posts" ON "posts"."blog_id" = "blogs"."id"
 => 3

Example of the SQL data returned by the LEFT OUTER JOIN query:
sqlite>  SELECT "blogs"."id" AS t0_r0, "blogs"."name" AS t0_r1, "blogs"."author" AS t0_r2, "blogs"."created_at" AS t0_r3, "blogs"."updated_at" AS t0_r4, "posts"."id" AS t1_r0, "posts"."title" AS t1_r1, "posts"."created_at" AS t1_r2, "posts"."updated_at" AS t1_r3, "posts"."blog_id" AS t1_r4 FROM "blogs" LEFT OUTER JOIN "posts" ON "posts"."blog_id" = "blogs"."id";
1|Blog 1|someone|2015-11-11 15:22:35.015095|2015-11-11 15:22:35.015095|1|Post 1-1|2015-11-11 15:22:35.053689|2015-11-11 15:22:35.053689|1
1|Blog 1|someone|2015-11-11 15:22:35.015095|2015-11-11 15:22:35.015095|2|Post 1-2|2015-11-11 15:22:35.058113|2015-11-11 15:22:35.058113|1
1|Blog 1|someone|2015-11-11 15:22:35.015095|2015-11-11 15:22:35.015095|3|Post 1-3|2015-11-11 15:22:35.062776|2015-11-11 15:22:35.062776|1
1|Blog 1|someone|2015-11-11 15:22:35.015095|2015-11-11 15:22:35.015095|4|Post 1-4|2015-11-11 15:22:35.065994|2015-11-11 15:22:35.065994|1
1|Blog 1|someone|2015-11-11 15:22:35.015095|2015-11-11 15:22:35.015095|5|Post 1-5|2015-11-11 15:22:35.069632|2015-11-11 15:22:35.069632|1
2|Blog 2|someone|2015-11-11 15:22:35.072871|2015-11-11 15:22:35.072871|6|Post 2-1|2015-11-11 15:22:35.078644|2015-11-11 15:22:35.078644|2
2|Blog 2|someone|2015-11-11 15:22:35.072871|2015-11-11 15:22:35.072871|7|Post 2-2|2015-11-11 15:22:35.081845|2015-11-11 15:22:35.081845|2
2|Blog 2|someone|2015-11-11 15:22:35.072871|2015-11-11 15:22:35.072871|8|Post 2-3|2015-11-11 15:22:35.084888|2015-11-11 15:22:35.084888|2
2|Blog 2|someone|2015-11-11 15:22:35.072871|2015-11-11 15:22:35.072871|9|Post 2-4|2015-11-11 15:22:35.087778|2015-11-11 15:22:35.087778|2
2|Blog 2|someone|2015-11-11 15:22:35.072871|2015-11-11 15:22:35.072871|10|Post 2-5|2015-11-11 15:22:35.090781|2015-11-11 15:22:35.090781|2
3|Blog 3|someone|2015-11-11 15:22:35.093902|2015-11-11 15:22:35.093902|11|Post 3-1|2015-11-11 15:22:35.097479|2015-11-11 15:22:35.097479|3
3|Blog 3|someone|2015-11-11 15:22:35.093902|2015-11-11 15:22:35.093902|12|Post 3-2|2015-11-11 15:22:35.103512|2015-11-11 15:22:35.103512|3
3|Blog 3|someone|2015-11-11 15:22:35.093902|2015-11-11 15:22:35.093902|13|Post 3-3|2015-11-11 15:22:35.108775|2015-11-11 15:22:35.108775|3
3|Blog 3|someone|2015-11-11 15:22:35.093902|2015-11-11 15:22:35.093902|14|Post 3-4|2015-11-11 15:22:35.112654|2015-11-11 15:22:35.112654|3
3|Blog 3|someone|2015-11-11 15:22:35.093902|2015-11-11 15:22:35.093902|15|Post 3-5|2015-11-11 15:22:35.117601|2015-11-11 15:22:35.117601|3

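Rails gives us the expected result (3 blogs), but the SQL result set is much larger: 15 rows, with each blog's columns repeated once per post. That redundancy is the efficiency cost of the LEFT OUTER JOIN.

So the conclusion is that includes is preferable to eager_load.

While researching this, I ended up writing a blog post on Preload, Eager_load, Includes, References, and Joins in Rails. I hope it helps.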
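
To make the size difference concrete, the sketch below rebuilds the seed data in plain Ruby, simulates the LEFT OUTER JOIN, and counts how many values each approach sends back. It is an illustration over in-memory hashes with the timestamp columns left out, not a measurement against a real database; left_outer_join is a hypothetical helper.

```ruby
# Seed-like data without the timestamp columns (3 blogs, 5 posts each).
blogs = (1..3).map { |b| { id: b, name: "Blog #{b}", author: "someone" } }
posts = (1..3).flat_map do |b|
  (1..5).map { |p| { id: (b - 1) * 5 + p, title: "Post #{b}-#{p}", blog_id: b } }
end

# Hypothetical helper: one output row per (blog, post) pair, or a single
# row with a nil post when a blog has no posts.
def left_outer_join(blogs, posts)
  blogs.flat_map do |blog|
    matches = posts.select { |post| post[:blog_id] == blog[:id] }
    matches = [nil] if matches.empty?
    matches.map { |post| { blog: blog, post: post } }
  end
end

rows = left_outer_join(blogs, posts)

# Values returned by the single JOIN: every row carries the blog columns again.
join_values = rows.size * (blogs.first.size + posts.first.size)
# Values returned by two separate queries: each record appears once.
separate_values = blogs.sum(&:size) + posts.sum(&:size)
# Only one object per distinct blog is needed in the end.
distinct_blogs = rows.map { |row| row[:blog][:id] }.uniq.size

puts "join rows: #{rows.size}, distinct blogs: #{distinct_blogs}"
puts "values via join: #{join_values}, via separate queries: #{separate_values}"
```

Even in this tiny data set the join carries 90 values where two separate queries carry 54, and the repeated blog columns account for the whole difference.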


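Answer 1: (score: 0)

So, as it turns out, ActiveRecord used to try to combine everything into a single query, but it was later decided that this was not a good idea.

I explored this using the query above and 4,000 records.

Quick analysis:

eager_load took 2,600 ms. includes took 72 ms.

eager_load took about 36 times as long as includes.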
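
A comparison like this can be timed with Ruby's standard Benchmark module. The sketch below is a generic harness, not the code used for the numbers above; time_ms is a hypothetical helper, and the two lambdas are dummy placeholders where the real includes and eager_load calls would go.

```ruby
require "benchmark"

# Hypothetical helper: wall-clock time of a block in milliseconds.
def time_ms
  Benchmark.realtime { yield } * 1000.0
end

# Dummy work only. In a Rails console these lambdas would wrap the real
# queries instead, for example:
#   -> { current_user.association.includes(a: [:b, {c: :d}, {e: :f}]).to_a }
#   -> { current_user.association.eager_load(a: [:b, {c: :d}, {e: :f}]).to_a }
candidates = {
  "includes"   => -> { 10_000.times.map { |i| i * 2 } },
  "eager_load" => -> { 10_000.times.map { |i| i * 2 } }
}

timings = candidates.transform_values { |work| time_ms { work.call } }
timings.each { |name, ms| puts format("%-10s %8.2f ms", name, ms) }

# Ratio for the figures quoted above: 2,600 ms against 72 ms.
ratio = (2600.0 / 72).round(1)
puts "ratio: #{ratio}"
```

Running each candidate several times and discarding the first run would give steadier numbers, since caches are cold on the first query.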