Rails Searchkick has_many indexing and searching

Date: 2017-09-13 19:43:41

Tags: elasticsearch ruby-on-rails-5 searchkick

I can search by customer_id, name, and lastname, and by the kids' id, name, lastname, and birthdate.

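Searching by id has to be exact. Searching by name or lastname works with misspellings at edit distance 2. But I want a search by kid_birthdate to match exactly (misspellings with distance 0).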

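So far, whenever I search by birthdate, the results come back as if misspellings with distance 2 were allowed. I don't know how to search for an exact date.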

Rails 5.1.0.rc1

elasticsearch-5.0.3

searchkick-2.2.0

class Customer < ActiveRecord::Base
  include Searchable

  def search_data
    attributes.merge avatar_url: avatar.url, kids: kids
  end

  has_many :kids
  ...
end

class Kid < ActiveRecord::Base
  belongs_to :customer

  def reindex_customer
    customer.reindex async: true
  end
  ...
end

module Searchable
  extend ActiveSupport::Concern

  included do
    SEARCH_RESULTS_PER_PAGE = 10

    def self.elastic_search(query, opts = { page: 1 })
      # This regex accepts strings that contain digits or dates
      regexp = /(\d+)|(^(0[1-9]|1\d|2\d|3[01])-(0[1-9]|1[0-2])-(19|20)\d{2}$)/
      # Misspelling edit distance: 0 for digits and dates, 2 for other strings
      distance = query.match?(regexp) ? 0 : 2
      options = { load: false,
                  match: :word_middle,
                  misspellings: { edit_distance: distance },
                  per_page: SEARCH_RESULTS_PER_PAGE,
                  page: opts[:page] }
      search query, options
    end
  end
end
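Two details of the regex above are worth checking. The `(\d+)` branch is unanchored, so any query containing a digit gets distance 0, and the date branch expects `dd-mm-yyyy` while the indexed birthdates are stored as `yyyy-mm-dd`. A minimal sketch of a stricter classifier follows; the helper names `iso_date?` and `edit_distance_for` are hypothetical and not part of Searchkick, and it assumes users type dates in the same ISO format that is indexed.

```ruby
require "date"

# Hypothetical helpers, not part of Searchkick.
ISO_DATE    = /\A\d{4}-\d{2}-\d{2}\z/  # assumes yyyy-mm-dd, as stored in the index
DIGITS_ONLY = /\A\d+\z/                # ids: the whole query must be digits

# True only for strings that are well-formed AND real calendar dates.
def iso_date?(query)
  return false unless query.match?(ISO_DATE)
  Date.strptime(query, "%Y-%m-%d")
  true
rescue ArgumentError
  false
end

# 0 for ids and dates (no misspellings allowed), 2 for everything else.
def edit_distance_for(query)
  q = query.to_s.strip
  (q.match?(DIGITS_ONLY) || iso_date?(q)) ? 0 : 2
end
```

This only decides the `edit_distance` value. With `match: :word_middle` still in effect, a distance of 0 may not be enough on its own to make a date search exact.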

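My index contains the customer data together with the data of their kids. The kids are nested under their parent customer. How can I force an exact match when searching for dates?

For this query: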

curl http://localhost:9200/customers_development/_search?pretty -d '{"query":{"dis_max":{"queries":[{"match":{"_all":{"query":"28388","boost":10,"operator":"and","analyzer":"searchkick_search"}}},{"match":{"_all":{"query":"28388","boost":10,"operator":"and","analyzer":"searchkick_search2"}}},{"match":{"_all":{"query":"28388","boost":1,"operator":"and","analyzer":"searchkick_search","fuzziness":0,"prefix_length":0,"max_expansions":3,"fuzzy_transpositions":true}}},{"match":{"_all":{"query":"28388","boost":1,"operator":"and","analyzer":"searchkick_search2","fuzziness":0,"prefix_length":0,"max_expansions":3,"fuzzy_transpositions":true}}}]}},"size":10,"from":0,"timeout":"11s"}'

This is what the index looks like:

{
  "took": 11,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 97.29381,
    "hits": [
      {
        "_index": "customers_development_20170913145033808",
        "_type": "customer",
        "_id": "28388",
        "_score": 97.29381,
        "_source": {
          "id": 28388,
          "created_at": "2017-07-10T19:49:43.856Z",
          "updated_at": "2017-09-13T03:01:51.727Z",
          "name": "Linda",
          "lastname": "Schott",
          "email": "linda.schott@web.de",
          "avatar": null,
          "phone": null,
          "mobile": null,
          "erster_kontakt": null,
          "memo": null,
          "brief_title": null,
          "newsletter": null,
          "avatar_url": "/no_customer.png",
          "kids": [
            {
              "id": 34229,
              "name": "Jakob",
              "lastname": "Schott",
              "birthdate": "2013-03-22",
              "age": "4,5",
              "avatar": {
                "url": "/avatars/kid/34229/Jellyfish.png",
                "thumb": {
                  "url": "/avatars/kid/34229/thumb_Jellyfish.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.058Z",
              "updated_at": "2017-09-13T03:02:52.962Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "black",
              "score": 30,
              "current_level": "swimmys"
            },
            {
              "id": 34228,
              "name": "Lilith",
              "lastname": "Schott",
              "birthdate": "2013-03-22",
              "age": "4,5",
              "avatar": {
                "url": "/avatars/kid/34228/Penguins.png",
                "thumb": {
                  "url": "/avatars/kid/34228/thumb_Penguins.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.058Z",
              "updated_at": "2017-09-13T03:02:52.962Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "green",
              "score": 17,
              "current_level": "beginner"
            },
            {
              "id": 27718,
              "name": "Johanna",
              "lastname": "Plischke",
              "birthdate": "2010-12-29",
              "age": "6,8",
              "avatar": {
                "url": "/avatars/kid/27718/Koala.png",
                "thumb": {
                  "url": "/avatars/kid/27718/thumb_Koala.png"
                }
              },
              "created_at": "2017-07-10T19:50:16.034Z",
              "updated_at": "2017-09-13T04:01:15.261Z",
              "customer_id": 28388,
              "member": null,
              "year_certified": null,
              "zahlart": null,
              "tn_merge_markiert": null,
              "family": null,
              "medal": "red",
              "score": 27,
              "current_level": ""
            }
          ]
        }
      }
    ]
  }
}

1 Answer:

Answer 0 (score: 0)

Let's analyze one part of the query:

"match":{
    "_all":{
        "query":"28388",
        "boost":1,
        "operator":"and",
        "analyzer":"searchkick_search",
        "fuzziness":0,
        "prefix_length":0,
        "max_expansions":3,
        "fuzzy_transpositions":true
    }
}

_all

You said that kids is a nested field, but you are only searching _all, so the first thing to establish is whether kids is included in _all.

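As the documentation says:

Sets the default include_in_all value for all the properties within the nested object. Nested documents do not have their own _all field. Instead, values are added to the _all field of the main “root” document.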

So the first question is whether the nested type in your index has include_in_all set to false, which would make the nested fields unsearchable through _all.

Nested query

Alternatively, you can use a nested query to search the nested objects:

GET /_search
{
    "query": {
        "nested" : {
            "path" : "kids",
            "score_mode" : "avg",
            "query" : {
                "query_string": {
                  "fields": ["kids.birthdate"],
                  "query": "xxx"
                } 
            }
        }
    }
}
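For an exact date match, the same nested query could be built from Ruby with a `term` clause in place of `query_string`. This is only a sketch: `kids_birthdate_query` is a hypothetical helper, and it assumes `kids` is mapped as a `nested` type, which the posted model code does not show. If `kids` is a plain object field, the `term` clause would be used without the `nested` wrapper.

```ruby
require "json"

# Hypothetical helper: builds the request body for an exact birthdate match.
# Assumes `kids` is mapped as `nested` and `kids.birthdate` holds yyyy-mm-dd values.
def kids_birthdate_query(date)
  {
    query: {
      nested: {
        path: "kids",
        # term does no fuzzy matching, so the date has to match exactly
        query: { term: { "kids.birthdate" => date } }
      }
    }
  }
end

puts JSON.pretty_generate(kids_birthdate_query("2013-03-22"))
```

Searchkick accepts a raw request body through its `body:` option, so, assuming the mapping above, something like `Customer.search(body: kids_birthdate_query("2013-03-22"))` should send this query unchanged.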

Fuzzy

When it comes to misspellings, Elasticsearch suggests using a fuzzy query:

GET /_search
{
    "query": {
        "fuzzy" : {
            "name" : {
                "value" :         "xxx",
                 "boost" :         1.0,
                 "fuzziness" :     2,
                 "prefix_length" : 0,
                 "max_expansions": 100
            }
        }
    }
}

Combined query

Finally, we can combine them with a bool query:

POST _search
{
  "query": {
    "bool" : {
      "must" : [{
            "nested" : {
                "path" : "kids",
                "query" : {
                    "query_string": {
                      "fields": ["kids.birthdate"],
                      "query": "xxx"
                    } 
                }
            }            
       },
        {  "fuzzy" : {
                "name" : {
                    "value" :         "xxx",
                     "boost" :         1.0,
                     "fuzziness" :     2,
                     "prefix_length" : 0,
                     "max_expansions": 100
                }
           }
       }]
    }
  }
}
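A rough Ruby counterpart of the combined query is sketched below. `customer_search_body` is a hypothetical helper that adds each clause only when its value is present. It swaps `query_string` for `term` on the birthdate so that the date part is exact, and it again assumes `kids` is mapped as `nested`.

```ruby
# Hypothetical helper: exact birthdate match combined with a fuzzy name match.
# Either argument may be omitted; only the clauses that are given end up in `must`.
def customer_search_body(name: nil, birthdate: nil)
  clauses = []

  if birthdate
    # Exact match on the nested kid's birthdate (assumes a nested mapping)
    clauses << {
      nested: {
        path: "kids",
        query: { term: { "kids.birthdate" => birthdate } }
      }
    }
  end

  if name
    # Tolerates misspellings on the customer's name, up to edit distance 2
    clauses << { fuzzy: { "name" => { value: name, fuzziness: 2 } } }
  end

  { query: { bool: { must: clauses } } }
end

body = customer_search_body(name: "linda", birthdate: "2013-03-22")
```

Note that `fuzzy` is a term-level query, so `value` is not analyzed. Whether `"linda"` or `"Linda"` is the right input depends on how the `name` field is indexed.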

I'm not familiar with Ruby, so I can't help with the Ruby side. Hope this helps.