Question

I am trying to count articles of certain types. My query is very inefficient:

Article.where(status: 'Finished').select{|x| x.tags & Article::EXPERT_TAGS}.size

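As part of my quest to become a better programmer, I would like to know how to make this query faster. tags is an array of strings on Article, and Article::EXPERT_TAGS is another array of strings. I want to find the intersection of the arrays and get the count of the resulting records.

Edit: Article::EXPERT_TAGS and article.tags are defined as Mongo arrays. These arrays contain strings, and I believe they are serialized strings. For example: Article.first.tags = ["Guest Writer", "News Article", "Press Release"]. Unfortunately, there is no separate tags table.

Second edit: I am using MongoDB, so the query actually goes through a Mongo wrapper such as MongoMapper or Mongoid, not ActiveRecord. That was my mistake, sorry! Because of it, the analysis of this question went off track. Thanks to PinnyM for pointing out the error!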

Answer 1

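Since you are using MongoDB, you could also consider a MongoDB-specific solution for array intersection (the aggregation framework), so that the database does all the work before you fetch the final result.

See this SO post: How to check if an array field is a part of another array in MongoDB?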
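
As one possible sketch, assuming that "sharing at least one expert tag" is the condition wanted: MongoDB's $in operator, applied to an array field, matches documents whose array contains at least one of the listed values, so the filtering and counting could happen in the database. The Mongoid call shown in the comment is an assumption about the wrapper in use and has not been run against the asker's schema. The matches_selector? helper, the sample documents and SAMPLE_EXPERT_TAGS are hypothetical; the helper only emulates the $in semantics in plain Ruby so that the example is self-contained.

```ruby
# Hypothetical stand-in for Article::EXPERT_TAGS.
SAMPLE_EXPERT_TAGS = ["Guest Writer", "Press Release"].freeze

# Query document that would be sent to MongoDB.
# With Mongoid this could look like (assumption, untested here):
#   Article.where(status: 'Finished').in(tags: Article::EXPERT_TAGS).count
SELECTOR = {
  "status" => "Finished",
  "tags"   => { "$in" => SAMPLE_EXPERT_TAGS }
}.freeze

# Hypothetical in-memory emulation of the two selector forms used above:
# plain equality, and $in against an array field.
def matches_selector?(doc, selector)
  selector.all? do |field, condition|
    if condition.is_a?(Hash) && condition.key?("$in")
      # An array field matches when it shares at least one value with the list.
      (Array(doc[field]) & condition["$in"]).any?
    else
      doc[field] == condition
    end
  end
end

# Made-up documents for illustration only.
SAMPLE_DOCS = [
  { "status" => "Finished", "tags" => ["Guest Writer", "News Article"] },
  { "status" => "Finished", "tags" => ["News Article"] },
  { "status" => "Draft",    "tags" => ["Press Release"] }
].freeze

puts SAMPLE_DOCS.count { |doc| matches_selector?(doc, SELECTOR) }
```

In a real application only the selector (or the wrapper call that builds it) would be needed; the emulation is there purely to show which documents the selector is expected to match.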

Answer 2

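Assuming the entire tags list is stored in a single database field and you want to keep it that way, I don't see much scope for improvement, since you need to load all of the data into Ruby for processing.

However, there is one problem with the database query: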

Article.where(status: 'Finished')

# This translates into the following query
SELECT * FROM articles WHERE status = 'Finished'

Basically, you are fetching all columns, while your processing only needs the tags column. So you can use pluck like this:

Article.where(status: 'Finished').pluck(:tags)

# This translates into the following query
SELECT tags FROM articles WHERE status = 'Finished'
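
A minimal sketch of the Ruby-side counting step once only the tags have been fetched. The sample data, DEMO_EXPERT_TAGS and both helper methods are made up for illustration. One detail worth noting: in Ruby an empty array is truthy, so a select block that returns the bare intersection keeps every record; the sketch checks whether the intersection is non-empty instead.

```ruby
# Hypothetical stand-in for Article::EXPERT_TAGS.
DEMO_EXPERT_TAGS = ["Guest Writer", "Press Release"].freeze

# Hypothetical stand-in for the result of pluck(:tags): one tag array per article.
DEMO_PLUCKED_TAGS = [
  ["Guest Writer", "News Article"],
  ["News Article"],
  [],
  ["Press Release", "Guest Writer"]
].freeze

# Count the tag lists that share at least one tag with expert_tags.
def count_with_expert_tags(tag_lists, expert_tags)
  tag_lists.count { |tags| (tags & expert_tags).any? }
end

# For comparison: the bare intersection is always truthy, even when empty,
# so this version counts every tag list.
def count_with_bare_intersection(tag_lists, expert_tags)
  tag_lists.select { |tags| tags & expert_tags }.size
end

puts count_with_expert_tags(DEMO_PLUCKED_TAGS, DEMO_EXPERT_TAGS)
puts count_with_bare_intersection(DEMO_PLUCKED_TAGS, DEMO_EXPERT_TAGS)
```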

Answer 3

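I answered a question about general intersection-style queries in ActiveRecord here.

An excerpt follows:

Here is the general approach I use for building intersection-like queries in ActiveRecord: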

class Service < ActiveRecord::Base
  belongs_to :person

  def self.with_types(*types)
    where(service_type: types)
  end
end

class City < ActiveRecord::Base
  has_and_belongs_to_many :services
  has_many :people, inverse_of: :city
end

class Person < ActiveRecord::Base
  belongs_to :city, inverse_of: :people

  def self.with_cities(cities)
    where(city_id: cities)
  end

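  # Intersection-like query: builds one id subquery per type and ANDs them together.
  def self.with_all_service_types(*types)
    types.map { |t|
      joins(:services).merge(Service.with_types t).select(:id)
    }.reduce(scoped) { |scope, subquery| # `scoped` is the Rails 3 API; Rails 4+ uses `all`
      scope.where(id: subquery)
    }
  end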
end

Person.with_all_service_types(1, 2)
Person.with_all_service_types(1, 2).with_cities(City.where(name: 'Gold Coast'))

It will generate SQL of the following form:

SELECT "people".*
  FROM "people"
 WHERE "people"."id" in (SELECT "people"."id" FROM ...)
   AND "people"."id" in (SELECT ...)
   AND ...

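As long as each subquery returns the ids of the matching people in its result set, you can use the approach above to build as many subqueries as you need, based on any conditions, joins, and so on.

The result sets of the subqueries are ANDed together, which restricts the matching set to the intersection of all subqueries.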
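
To illustrate why ANDing the "id IN (subquery)" conditions yields an intersection, here is a small plain-Ruby sketch that mirrors the reduce step with sets of ids instead of database subqueries. The id values and the names SUBQUERY_IDS, ALL_PEOPLE_IDS and intersect_all are hypothetical; this is an analogy to the query-building code, not a replacement for it.

```ruby
require "set"

# Hypothetical id sets, one per subquery (e.g. people having a given service type).
SUBQUERY_IDS = [
  Set[1, 2, 3, 5],
  Set[2, 3, 4],
  Set[3, 2, 9]
].freeze

# Hypothetical ids of all people, playing the role of the initial scope.
ALL_PEOPLE_IDS = Set.new(1..10)

# Start from "everyone" and narrow by each subquery in turn, mirroring
#   reduce(scoped) { |scope, subquery| scope.where(id: subquery) }
def intersect_all(all_ids, id_sets)
  id_sets.reduce(all_ids) { |remaining, ids| remaining & ids }
end

puts intersect_all(ALL_PEOPLE_IDS, SUBQUERY_IDS).to_a.sort.inspect
```

With no subqueries at all, the reduce simply returns the initial scope, which matches the behaviour of calling with_all_service_types without arguments.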

ActiveRecord query for array intersection?
