Elasticsearch

Date: 2016-11-02 01:57:38

Tags: elasticsearch

I have indexed the following documents in Elasticsearch:

POST /books/book/1
{
  "title" : "JavaScript: The Good Parts",
  "author" : "Douglas Crockford",
  "language" : "JavaScript",
  "publishYear" : 2009,
  "soldCopy" : "50"
}

POST /books/book/2
{
  "title" : "JavaScript: The Good Parts",
  "author" : "Douglas Crockford",
  "language" : "JavaScript",
  "publishYear" : 2009,
  "soldCopy" : "110"
}

POST /books/book/3
{
  "title" : "JavaScript: The Good Parts",
  "author" : "Douglas Crockford1",
  "language" : "JavaScript",
  "publishYear" : 2011,
  "soldCopy" : "2"
}

POST /books/book/4
{
  "title" : "JavaScript: The Good Parts",
  "author" : "Douglas Crockford2",
  "language" : "JavaScript",
  "publishYear" : 2012,
  "soldCopy" : "5"
}

I am querying Elasticsearch for the distinct title and author for a given year, 2009. The output I expect from the query is:

JavaScript: The Good Parts Douglas Crockford

But the response contains 2 records with identical output:

JavaScript: The Good Parts      Douglas Crockford
JavaScript: The Good Parts      Douglas Crockford

The Elasticsearch query I used is:

{
  "query": {
    "match": {
      "publishYear": "2009"
    }
  }
}

In database terms, the equivalent select query I am trying to reproduce is:

select distinct title,author from book where publishYear = '2009'

How can I get the same output from Elasticsearch as from my SQL query? Thanks.

1 answer:

Answer 0 (score: 0)

DISTINCT in SQL corresponds to a terms aggregation in Elasticsearch:

{
  "query": {
    "match": {
      "publishYear": "2009"
    }
  },
  "aggs": {
    "unique_author": {
      "terms": {
        "field": "author",
        "size": 10
      }
    },
    "unique_book": {
      "terms": {
        "field": "title",
        "size": 10
      }
    }
  },
  "size": 0
}

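For this to work, the title and author fields must be not_analyzed. Alternatively, you can use the keyword tokenizer together with the lowercase token filter, in which case the bucket keys come back lowercased. A better option is to make them multi fields, so the same value is indexed both analyzed (for search) and not_analyzed (for aggregation).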
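To illustrate why the fields need to be not_analyzed: a terms aggregation buckets on the indexed terms, so an analyzed field yields one bucket per token instead of one per whole value. The sketch below is only a rough approximation of the default standard analyzer (lowercase, split on non-word characters), not Elasticsearch's actual implementation; `approx_standard_analyze` and `keep_whole` are hypothetical helpers introduced for this example.

```python
import re


def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer: lowercase, split on non-word characters."""
    return re.findall(r"\w+", text.lower())


def keep_whole(text):
    """What a not_analyzed field indexes: the original value as a single term."""
    return [text]


title = "JavaScript: The Good Parts"
author = "Douglas Crockford"

print(approx_standard_analyze(title))   # ['javascript', 'the', 'good', 'parts']
print(approx_standard_analyze(author))  # ['douglas', 'crockford']
print(keep_whole(author))               # ['Douglas Crockford']
```

With the analyzed field, the aggregation would report buckets such as "douglas" and "crockford"; with the not_analyzed field it reports "Douglas Crockford" as one bucket.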

You can create the index like this:

PUT books
{
  "mappings": {
    "book":{
      "properties": {
        "title":{
          "type": "string",
          "fields": {
            "raw":{
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "author":{
          "type": "string",
          "fields": {
            "raw":{
              "type": "string",
              "index": "not_analyzed"
            }
          }
        },
        "language":{
          "type": "string"
        },
        "publishYear":{
          "type": "integer"
        },
        "soldCopy":{
          "type": "string"
        }
      }
    }
  }
}

Then use the .raw sub-fields (title.raw and author.raw) in the aggregation.
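Note that two sibling terms aggregations return distinct authors and distinct titles separately. If distinct (title, author) pairs are needed, as in the SQL query, one option is to nest one terms aggregation inside the other. The following is a minimal sketch, assuming the index uses the multi-field mapping above. The names `build_distinct_query`, `distinct_pairs`, `by_title` and `by_author` are hypothetical, and `sample_response` is hand-written in the usual bucket shape for illustration, not captured from a real cluster.

```python
import json


def build_distinct_query(year, size=10):
    """Build a search body: filter by publishYear, bucket by title.raw, then by author.raw."""
    return {
        "query": {"match": {"publishYear": year}},
        "size": 0,  # return aggregation buckets only, no hits
        "aggs": {
            "by_title": {
                "terms": {"field": "title.raw", "size": size},
                "aggs": {
                    "by_author": {"terms": {"field": "author.raw", "size": size}}
                },
            }
        },
    }


def distinct_pairs(response):
    """Flatten nested terms buckets into (title, author) pairs."""
    pairs = []
    for title_bucket in response["aggregations"]["by_title"]["buckets"]:
        for author_bucket in title_bucket["by_author"]["buckets"]:
            pairs.append((title_bucket["key"], author_bucket["key"]))
    return pairs


# Hand-written sample in the shape of a terms aggregation response (illustrative only).
sample_response = {
    "aggregations": {
        "by_title": {
            "buckets": [
                {
                    "key": "JavaScript: The Good Parts",
                    "doc_count": 2,
                    "by_author": {
                        "buckets": [{"key": "Douglas Crockford", "doc_count": 2}]
                    },
                }
            ]
        }
    }
}

print(json.dumps(build_distinct_query(2009), indent=2))
print(distinct_pairs(sample_response))
```

The printed JSON body can be sent to the books index's _search endpoint. Each inner bucket corresponds to one row of the SQL DISTINCT result, limited to the top `size` terms at each level.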