How to use the ElasticSearch-Rails query DSL to return related relationships

Asked: 2015-04-10 14:55:42

Tags: ruby-on-rails ruby-on-rails-4 elasticsearch relationship elasticsearch-rails

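I am new to ElasticSearch, but I need to use it to return a list of products. Please do not post answers, or links to old answers, that reference the deprecated Tire gem.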

Gemfile

ruby '2.2.0'
gem 'rails', '4.0.3'
gem 'elasticsearch-model', '~> 0.1.6'
gem 'elasticsearch-rails', '~> 0.1.6'

I have several models with relationships between them. The relationships are listed below.

Models and relationships

product.rb

  include Searchable

  belongs_to :family
  belongs_to :collection
  has_many :benefits_products
  has_many :benefits, :through => :benefits_products

  def as_indexed_json(options={})
    as_json(
        include: {:benefits => { :only => [ :id, :name ] },
                  :categories => { :only => [ :id, :name ] } }
    )
  end

collection.rb

  include Searchable

  has_many :products

  def as_indexed_json(options={})
    as_json(
      include: [:products]
    )
  end

family.rb

  include Searchable

  has_many :products

  def as_indexed_json(options={})
    as_json(
      include: [:products]
    )
  end

benefit.rb

  include Searchable

  has_many :benefits_products
  has_many :products, :through => :benefits_products

  def as_indexed_json(options={})
    as_json(
      include: [:products]
    )
  end

searchable.rb is just a concern that includes the Elasticsearch search and callbacks in all models

module Searchable
  extend ActiveSupport::Concern

  included do
    include Elasticsearch::Model
    include Elasticsearch::Model::Callbacks

    settings index: { number_of_shards: 1, number_of_replicas: 0 } do
      mapping do

        indexes :id, type: 'long'
        indexes :name, type: 'string'
        indexes :family_id, type: 'long'
        indexes :collection_id, type: 'long'
        indexes :created_at, type: 'date'
        indexes :updated_at, type: 'date'

        indexes :benefits, type: 'nested' do
          indexes :id, type: 'long'
          indexes :name, type: 'string'
        end

        indexes :categories, type: 'nested' do
          indexes :id, type: 'long'
          indexes :name, type: 'string'
        end

      end
    end

    def self.search(options={})
      __set_filters = lambda do |key, f|

        @search_definition[:filter][:and] ||= []
        @search_definition[:filter][:and]  |= [f]
      end

      @search_definition = {
        query: {
          filtered: {
            query: {
              match_all: {}
            }
          }
        },
        filter: {}
      }

      if options[:benefits]
        f = { term: { "benefits.id": options[:benefits] } }

        __set_filters.(:collection_id, f)
        __set_filters.(:family_id, f)
        __set_filters.(:categories, f)
      end

      def as_indexed_json(options={})
        as_json(
          include: {:benefits => { :only => [ :id, :name ] },
                    :categories => { :only => [ :id, :name ] } }
        )
      end

      if options[:categories]
        ...
      end

      if options[:collection_id]
        ...
      end

      if options[:family_id]
        ...
      end

      __elasticsearch__.search(@search_definition)
    end

  end
end

ElasticSearch

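I break the passed slug down into the various families, collections, and benefits. I am able to search for products with a specific family or collection and get correct results. I am also able to return results for a single benefit, but they do not appear to be accurate. Searching on multiple benefits also yields strange results. I want the "AND" combination of all the searched fields, but my results do not look like the outcome of either an "AND" or an "OR", which baffles me as well.

What do I pass to the Product.search method to yield the desired results?

Thanks for any help you can provide!

Edit

I have now verified that the products are indexed. I ran curl -XGET 'http://127.0.0.1:9200/products/_search?pretty=1', which produced a JSON response that looks like this: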

{
  "id":4,
  "name":"product name",
  "family_id":16,
  "collection_id":6,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":2,"name":"my benefit 2"},
    {"id":6,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":2,"name":"category 2"}
  ]
},
{...}

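Now I just need to figure out how to search ElasticSearch for products that have benefits 2, 6, AND 7, which should return the example product above. I am specifically looking for the syntax to submit to the elasticsearch #search method to get the results of a nested "AND" query, the nested query setup/mappings (to make sure I have not missed anything), and any other relevant information you can think of to troubleshoot this.

Updated

The Searchable concern has been updated to reflect the answer received. I translated the mapping JSON object to fit the elasticsearch-model syntax. My remaining confusion comes when I try to translate the query in a similar fashion.

Second update

I am basing most of my searchable.rb concern on the elasticsearch-rails example app. I have updated searchable.rb to reflect this code, and while I do get results, they are not the result of an "AND" execution. When I apply two benefits, I get results from every product that has either benefit.

1 answer:

Answer 0: (score: 4)

By default, if you load data using dynamic mapping, ES creates nested objects as flat objects and therefore loses the relationship between the various nested properties. To maintain the proper relationships, we can use either nested objects or a parent-child relationship.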
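To make the flattening problem concrete, here is a minimal plain-Ruby sketch. It does not talk to Elasticsearch; `flatten_objects` is a hypothetical helper that only imitates how an array of objects ends up as flat multi-valued fields, and it ignores text analysis of the name field.

```ruby
# Imitates how an array of objects is stored when the field is NOT mapped
# as nested: every inner key becomes one flat, multi-valued field.
# `flatten_objects` is a hypothetical helper, not part of any library.
def flatten_objects(field, objects)
  objects.each_with_object(Hash.new { |h, k| h[k] = [] }) do |obj, flat|
    obj.each { |key, value| flat["#{field}.#{key}"] << value }
  end
end

benefits = [
  { "id" => 2, "name" => "my benefit 2" },
  { "id" => 6, "name" => "my benefit 6" }
]

# Two flat fields: "benefits.id" and "benefits.name", each holding an array.
flat = flatten_objects("benefits", benefits)

# No single benefit has id 2 together with the name "my benefit 6" ...
nested_match = benefits.any? { |b| b["id"] == 2 && b["name"] == "my benefit 6" }

# ... yet the flattened form matches, because the id/name pairing is gone.
flat_match = flat["benefits.id"].include?(2) &&
             flat["benefits.name"].include?("my benefit 6")

puts "nested: #{nested_match}, flat: #{flat_match}"
```

The object-by-object check fails while the flattened check succeeds, which is the loss of relationship described above.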

Here I will use nested objects to achieve the desired result:

Mapping:

PUT /index-3
{
  "mappings": {
    "products":{
      "properties": {
        "id": {
          "type": "long"
        },
        "name":{
          "type": "string"
        },
        "family_id":{
          "type": "long"
        },
        "collection_id":{
          "type": "long"
        },
        "created_at":{
          "type": "date"
        },
        "updated_at":{
          "type": "date"
        },
        "benefits":{
          "type": "nested",
          "include_in_parent": true,
          "properties": {
            "id": {
              "type": "long"
            },
            "name":{
              "type":"string"
            }
          }
        },
        "categories":{
          "type": "nested",
          "include_in_parent": true,
          "properties": {
            "id":{
              "type": "long"
            },
            "name":{
              "type":"string"
            }
          }
        }
      }
    }
  }
}

As you can see, I have mapped the child objects as nested and included them in the parent.

Now some sample data:

PUT /index-3/products/4
{
  "name":"product name 4",
  "family_id":15,
  "collection_id":6,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":2,"name":"my benefit 2"},
    {"id":6,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":2,"name":"category 2"}
  ]
}
PUT /index-3/products/5
{
  "name":"product name 5",
  "family_id":16,
  "collection_id":6,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":5,"name":"my benefit 2"},
    {"id":6,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":3,"name":"category 2"}
  ]
}
PUT /index-3/products/6
{
  "name":"product name 6",
  "family_id":15,
  "collection_id":5,
  "created_at":"2015-04-13T12:49:42.000Z",
  "updated_at":"2015-04-13T12:49:42.000Z",
  "benefits":[
    {"id":5,"name":"my benefit 2"},
    {"id":55,"name":"my benefit 6"},
    {"id":7,"name":"my benefit 7"}
  ],
  "categories":[
    {"id":3,"name":"category 2"}
  ]
}

Now the query part:

GET index-3/products/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "terms": {
          "benefits.id": [
            5,6,7
          ],
          "execution": "and"
        }
      }
    }
  }
}

This yields the following result:

{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "index-3",
            "_type": "products",
            "_id": "5",
            "_score": 1,
            "_source": {
               "name": "product name 5",
               "family_id": 16,
               "collection_id": 6,
               "created_at": "2015-04-13T12:49:42.000Z",
               "updated_at": "2015-04-13T12:49:42.000Z",
               "benefits": [
                  {
                     "id": 5,
                     "name": "my benefit 2"
                  },
                  {
                     "id": 6,
                     "name": "my benefit 6"
                  },
                  {
                     "id": 7,
                     "name": "my benefit 7"
                  }
               ],
               "categories": [
                  {
                     "id": 3,
                     "name": "category 2"
                  }
               ]
            }
         }
      ]
   }
}
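To see why only product 5 comes back, the matching rule can be imitated in plain Ruby against the benefit ids of the three sample documents. This is a simplified sketch, not Elasticsearch itself; `matching_products` is a hypothetical helper.

```ruby
# Benefit ids of the three sample products indexed above, keyed by product id.
sample_benefit_ids = {
  4 => [2, 6, 7],
  5 => [5, 6, 7],
  6 => [5, 55, 7]
}

# Imitates a terms filter on one multi-valued field.
# :and requires every term to be present; :or accepts any single term.
# `matching_products` is a hypothetical helper for illustration only.
def matching_products(docs, terms, execution)
  hits = docs.select do |_id, values|
    if execution == :and
      (terms - values).empty?
    else
      (terms & values).any?
    end
  end
  hits.keys
end

and_hits = matching_products(sample_benefit_ids, [5, 6, 7], :and)
or_hits  = matching_products(sample_benefit_ids, [5, 6, 7], :or)

puts "and: #{and_hits.inspect}"
puts "or:  #{or_hits.inspect}"
```

With "and" only product 5 holds all of 5, 6, and 7; with "or" all three products qualify, which matches the symptom described in the question.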

When querying, we must use the terms filter with "and" execution so that it retrieves only the documents that contain all of the terms.
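On the Rails side, the same search definition can be built as a Ruby hash. The following is a minimal sketch under a few assumptions: `benefits_filter` and `search_definition` are hypothetical helper names, and the `filtered` query with the `execution` option is the Elasticsearch 1.x syntax used in this answer. Note that the filter sits inside `filtered`, as in the JSON above, rather than at the top level of the definition.

```ruby
require "json"

# Builds the terms filter from the answer for a list of benefit ids.
# Helper names here are hypothetical and chosen for illustration.
def benefits_filter(ids)
  { terms: { "benefits.id" => Array(ids).map(&:to_i), execution: "and" } }
end

# Wraps the filter in the filtered/match_all query shown in the answer.
def search_definition(benefit_ids)
  {
    query: {
      filtered: {
        query:  { match_all: {} },
        filter: benefits_filter(benefit_ids)
      }
    }
  }
end

definition = search_definition([5, 6, 7])
puts JSON.pretty_generate(definition)

# In the Rails app this hash would then be handed to the model, for example:
#   Product.__elasticsearch__.search(definition)
```

Printing the hash as JSON makes it easy to compare against the request body above before wiring it into the concern.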