Elasticsearch query to search when the child element is an array of strings

Time: 2014-12-10 12:12:23

Tags: elasticsearch

I created a document in Elasticsearch in the following format:

curl -XPUT "http://localhost:9200/my_base.main_candidate/" -d'
{
    "specific_location": {
        "location_name": "Mumbai",
        "location_tags": [
            "Mumbai"
        ],
        "tags": [
            "Mumbai"
        ]
    }
}'

My requirement is to find documents whose location_tags contain at least one of a given set of options, for example ["Mumbai", "Pune"]. How can I do that?

I tried:

curl -XGET "http://localhost:9200/my_base.main_candidate/_search" -d '
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "terms": {
          "specific_location.location_tags" : ["Mumbai"]
        }
      }
    }
  }
}'

which doesn't work.

I got this output:

{
  "took": 72,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

1 Answer:

Answer 0: (score: 3)

There are a few ways to solve this. Perhaps the most immediate one is to search for mumbai instead of Mumbai.

If I create the index without a mapping,

curl -XDELETE "http://localhost:9200/my_base.main_candidate/"
curl -XPUT "http://localhost:9200/my_base.main_candidate/"

then add a doc:

curl -XPUT "http://localhost:9200/my_base.main_candidate/doc/1" -d'
{
   "specific_location": {
      "location_name": "Mumbai",
      "location_tags": [
         "Mumbai"
      ],
      "tags": [
         "Mumbai"
      ]
   }
}'

then run the query with a lowercase term,

curl -XPOST "http://localhost:9200/my_base.main_candidate/_search" -d'
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "terms": {
               "specific_location.location_tags": [
                  "mumbai"
               ]
            }
         }
      }
   }
}'

I get back the expected document:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1,
      "hits": [
         {
            "_index": "my_base.main_candidate",
            "_type": "doc",
            "_id": "1",
            "_score": 1,
            "_source": {
               "specific_location": {
                  "location_name": "Mumbai",
                  "location_tags": [
                     "Mumbai"
                  ],
                  "tags": [
                     "Mumbai"
                  ]
               }
            }
         }
      ]
   }
}

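This is because no explicit mapping was used, so Elasticsearch falls back to its defaults: the location_tags field is analyzed with the standard analyzer, which converts terms to lowercase. The terms filter, on the other hand, does not analyze the values you pass to it; it looks them up in the index exactly as written. So the term Mumbai does not exist in the index, but mumbai does.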
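To make the mismatch concrete, here is a small toy model in Python. It is only a sketch and not how Elasticsearch is implemented: `standard_analyze`, `build_index`, and `terms_filter` are made-up names, and the regex tokenizer is a rough approximation of the standard analyzer (split into words, then lowercase).

```python
import re

def standard_analyze(text):
    """Rough stand-in for the standard analyzer: word tokens, lowercased."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def build_index(docs, analyzer):
    """Build a toy inverted index: indexed term -> set of document ids."""
    index = {}
    for doc_id, values in docs.items():
        for value in values:
            for term in analyzer(value):
                index.setdefault(term, set()).add(doc_id)
    return index

def terms_filter(index, terms):
    """Look up each query term exactly as given (no analysis); match any."""
    hits = set()
    for term in terms:
        hits |= index.get(term, set())
    return hits

# One document whose location_tags array holds the single value "Mumbai".
docs = {"1": ["Mumbai"]}
analyzed = build_index(docs, standard_analyze)

print(terms_filter(analyzed, ["Mumbai"]))  # set()  -> no hit, "Mumbai" was never indexed
print(terms_filter(analyzed, ["mumbai"]))  # {'1'}  -> hit on the lowercased term
```

The same model hints at a limit of the lowercase workaround: a multi-word tag such as "New Delhi" would be indexed as the two separate terms new and delhi, so lowercasing the whole string would still not match it as a single term.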

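If you want to be able to use uppercase terms in your query, you will need to set up an explicit mapping that tells Elasticsearch not to analyze the location_tags field. Maybe something like this: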

curl -XDELETE "http://localhost:9200/my_base.main_candidate/"

curl -XPUT "http://localhost:9200/my_base.main_candidate/" -d'
{
   "mappings": {
      "doc": {
         "properties": {
            "specific_location": {
               "properties": {
                  "location_tags": {
                     "type": "string",
                     "index": "not_analyzed"
                  },
                  "tags": {
                     "type": "string",
                     "index": "not_analyzed"
                  }
               }
            }
         }
      }
   }
}'

curl -XPUT "http://localhost:9200/my_base.main_candidate/doc/1" -d'
{
   "specific_location": {
      "location_name": "Mumbai",
      "location_tags": [
         "Mumbai"
      ],
      "tags": [
         "Mumbai"
      ]
   }
}'

curl -XPOST "http://localhost:9200/my_base.main_candidate/_search" -d'
{
   "query": {
      "filtered": {
         "query": {
            "match_all": {}
         },
         "filter": {
            "terms": {
               "specific_location.location_tags": [
                  "Mumbai"
               ]
            }
         }
      }
   }
}'

Here is all the code above in one convenient place:

http://sense.qbox.io/gist/74844f4d779f7c2b94a9ab65fd76eb0ffe294cbb

[EDIT: By the way, I was using Elasticsearch 1.3.4 when I tested the code above]
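
Coming back to the original requirement (match any of several tags, e.g. Mumbai or Pune): the terms filter already matches a document if the field contains at least one of the listed terms, so it should be enough to list every option in the array. The Python sketch below only builds the request body and prints it; `build_terms_search` is a hypothetical helper name, and the body would still have to be sent to the `_search` endpoint with curl or an HTTP client of your choice.

```python
import json

def build_terms_search(field, values):
    """Build a filtered/terms search body matching docs where `field` holds any of `values`."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"terms": {field: list(values)}},
            }
        }
    }

# Exact-case values assume the not_analyzed mapping shown earlier;
# with the default (analyzed) mapping, pass lowercase values instead.
body = build_terms_search("specific_location.location_tags", ["Mumbai", "Pune"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the -d payload of the search requests shown earlier.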