在Elasticsearch中对JSON数据执行搜索

时间:2015-06-12 04:05:30

标签: json elasticsearch logstash

我已经通过Logstash将JSON数据映射到Elasticsearch,它已经有效,它已导入数据,我可以在Elasticsearch-Head中看到它。

我的问题是查询数据。我可以搜索字段,但它会将索引中的整个类型作为单个搜索结果返回。我尝试了一些变化,但没有运气。

以下是logstash发货人文件:

input {
   exec {
     type => "recom_db"
     command => "curl -s -X GET http://www.test.com/api/edselastic/recom_db.json"
     interval => 86400
     codec => "json"
   }
   exec {
     type => "recom_ki"
     command => "curl -s -X GET http://www.test.com/api/edselastic/recom_ki.json"
     interval => 86400
     codec => "json"
   }
   exec {
     type => "recom_un"
     command => "curl -s -X GET http://www.test.com/api/edselastic/recom_un.json"
     interval => 86400
     codec => "json"
   }
}
output {
        elasticsearch {
                host => localhost
                index => "lib-recommender-%{+yyyy.MM.dd}"
                template_name => "recommender-template"
        }
}

并且Elasticsearch索引采用以下形式:

{
    "_index": "lib-recommender-2015.06.11",
    "_type": "recom_un",
    "_id": "qoZE4aF-SkS--tq_8MhH4A",
    "_version": 1,
    "_score": 1,
    "_source": {
        "item": [{
            "name": "AAM219 -- reading lists",
            "link": "http://www.test.com/modules/aam219.html",
            "description": "AAM219 -- reading lists",
            "terms": {
                "term": ["AAM219"]
            }
        },
        {
            "name": "AAR410 -- reading lists",
            "link": "http://www.test.com/modules/aar410.html",
            "description": "AAR410 -- reading lists",
            "terms": {
                "term": ["AAR410"]
            }
        }
        ...

无论如何,所以我尝试使用Elasticsearch文档中看到的各种方式查询数据,但无法获得所需的结果。这是我尝试过的众多查询之一:

curl -XPOST "http://localhost:9200/lib-recommender/recom_un/_search" -d'
{
    "fields": ["item.name", "item.link"],
    "query":{
        "term": {
                "item.terms.term": "AAM219"
                        }
                }
        }
}'

但它返回索引中的整个类型(选择了正确的字段但是脱节了所有字段):

{
    "took": 13,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.006780553,
        "hits": [{
            "_index": "lib-recommender-2015.06.11",
            "_type": "recom_un",
            "_id": "qoZE4aF-SkS--tq_8MhH4A",
            "_score": 0.006780553,
            "fields": {
                "item.link": ["http://www.test.com/modules/aam219.html",
                "http://www.test.com/modules/aar410.html",
                "http://www.test.com/modules/ac1201.html",
                "http://www.test.com/modules/aca401.html",

我追随以下结果:

{
    "took": 5,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
    },
    "hits": {
        "total": 1,
        "max_score": 0.006780553,
        "hits": [{
            "_index": "lib-recommender-2015.06.11",
            "_type": "recom_un",
            "_id": "qoZE4aF-SkS--tq_8MhH4A",
            "_score": 0.006780553,
            "_source": {
                "item": [{
                    "name": "AAM219 -- reading lists",
                    "link": "http://www.test.com/modules/aam219.html",
                    "description": "AAM219 -- reading lists",
                    "terms": {
                        "term": ["AAM219"]
                    }
                }
            }
        }
    }
}

我错过了什么?这种搜索的索引映射是否错误(因此我应该在导入数据之前手动为elasticsearch创建映射文件)。查询中是否缺少参数?我一直在寻找答案,但感觉我现在在圈子里跑来跑去,我猜这很简单,我忽略了但不确定。

2 个答案:

答案 0 :(得分:2)

是的,要使用此类用例,您需要创建自定义映射,并确保NUMBER(1)结构的类型为nested,否则item中的所有字段都将为如你在所展示的结果中看到的那样折叠在一起。

所以映射必须是这样的:

item

然后你可以稍微修改你的查询以使用nested query而不是像这样。另请注意我包含inner_hits,因此您的结果只包含匹配的嵌套文档:

{
  "recom_un": {
    "properties": {
      "item": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "string"
          },
          "link": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "terms": {
            "properties": {
              "term": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

答案 1 :(得分:1)

支持上述Val的回答。它主要是什么,但与另一层嵌套。 这是映射:

{
  "recom_un": {
    "properties": {
      "item": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "string"
          },
          "link": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "terms": {
            "type": "nested",
            "properties": {
              "term": {
                "type": "string"
              }
            }
          }
        }
      }
    }
  }
}

我过去常常得到的搜索查询:

curl -XPOST "http://localhost:9200/lib-recommender/recom_un/_search" -d'
{
  "_source": false,
  "query": {
    "filtered": {
      "filter": {
        "nested": {
          "path": "item",
          "query": {
            "nested": {
              "path": "item.terms",
              "query": {
                "match": {
                  "term": "AAM219"
                }
              }
            }
          },
          "inner_hits": { }
        }
      }
    }
  }
}'