Excluding the _id and _index fields from Elasticsearch result data

Time: 2014-05-31 08:54:44

Tags: elasticsearch full-text-search

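If I simply hit the API, each document comes back with 5 fields. I only want two of them (user_id and loc_code), so I listed them in the fields list. The response still contains data I do not need, such as _shards, hits, timed_out, and so on.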

I send a POST request from the Postman plugin in Chrome with the following query:

<:9200>/myindex/mytype/_search
{
    "fields" : ["user_id", "loc_code"],
    "query":{"term":{"group_id":"1sd323s"}}
}   

// Output

 {
        "took": 17,
        "timed_out": false,
        "_shards": {
            "total": 5,
            "successful": 5,
            "failed": 0
        },
        "hits": {
            "total": 323,
            "max_score": 8.402096,
            "hits": [
                {
                    "_index": "myindex",
                    "_type": "mytype",
                    "_id": "<someid>",
                    "_score": 8.402096,
                    "fields": {
                        "user_id": [
                            "<someuserid>"
                        ],
                        "loc_code": [
                            768
                        ]
                    }
                },
               ...
            ]
        }
    }

But I only want the document fields (the two fields mentioned); I do not want _id, _index, or _type. Is there a way to do this?

2 answers:

Answer 0: (score: 0)

A solution that may not be complete, but helps a lot, is to use filter_path. For example, suppose the index contains the following:

PUT foods/_doc/_bulk
{ "index" : { "_id" : "1" } }
{ "name" : "chocolate cake", "calories": "too much" }
{ "index" : { "_id" : "2" } }
{ "name" : "lemon pie", "calories": "a lot!"  }
{ "index" : { "_id" : "3" } }
{ "name" : "pizza", "calories": "oh boy..."  }

A search like this...

GET foods/_search
{
  "query": {
    "match_all": {}
  }
}

...will produce a lot of metadata:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "foods",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "lemon pie",
          "calories" : "a lot!"
        }
      },
      {
        "_index" : "foods",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "chocolate cake",
          "calories" : "too much"
        }
      },
      {
        "_index" : "foods",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "name" : "pizza",
          "calories" : "oh boy..."
        }
      }
    ]
  }
}

However, if we give the search URL the parameter filter_path=hits.hits._source...

GET foods/_search?filter_path=hits.hits._source
{
  "query": {
    "match_all": {}
  }
}

...it will return only the source (although it is still deeply nested):

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "name" : "lemon pie",
          "calories" : "a lot!"
        }
      },
      {
        "_source" : {
          "name" : "chocolate cake",
          "calories" : "too much"
        }
      },
      {
        "_source" : {
          "name" : "pizza",
          "calories" : "oh boy..."
        }
      }
    ]
  }
}
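
Since the filtered response is still nested under hits.hits, the remaining unwrapping has to happen on the client. Here is a minimal Python sketch of that step. The function name extract_sources is hypothetical, and the response is assumed to be already parsed into a dict shaped like the output above:

```python
def extract_sources(response):
    """Return the list of _source dicts from a parsed search response."""
    # Both levels are read defensively, so a body with no "hits" key
    # simply produces an empty list.
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"] for hit in hits if "_source" in hit]


# Sample data copied from the filtered response shown above.
response = {
    "hits": {
        "hits": [
            {"_source": {"name": "lemon pie", "calories": "a lot!"}},
            {"_source": {"name": "chocolate cake", "calories": "too much"}},
            {"_source": {"name": "pizza", "calories": "oh boy..."}},
        ]
    }
}

docs = extract_sources(response)
for doc in docs:
    print(doc["name"], "->", doc["calories"])
```

This leaves you with a flat list of documents and none of the metadata.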

You can even filter down to individual fields:

GET foods/_search?filter_path=hits.hits._source.name
{
  "query": {
    "match_all": {}
  }
}

...and you will get this:

{
  "hits" : {
    "hits" : [
      {
        "_source" : {
          "name" : "lemon pie"
        }
      },
      {
        "_source" : {
          "name" : "chocolate cake"
        }
      },
      {
        "_source" : {
          "name" : "pizza"
        }
      }
    ]
  }
}

You can do more if you like; just check the documentation.
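
One of those further options is that filter_path takes several filters separated by commas. Below is a small Python sketch that builds such a search path. The helper name search_path is made up for illustration, and the foods index and its field names are the ones from the example above:

```python
from urllib.parse import urlencode


def search_path(index, filters):
    """Build a _search path whose filter_path lists several filters."""
    # Filters are joined with commas; safe="," keeps the commas readable
    # instead of percent-encoding them.
    query = urlencode({"filter_path": ",".join(filters)}, safe=",")
    return f"{index}/_search?{query}"


path = search_path(
    "foods",
    ["hits.hits._source.name", "hits.hits._source.calories"],
)
print(path)
```

The resulting path can be used in place of the one in the GET request above.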

Answer 1: (score: -2)

You can use the GET API instead. Try something like:

/myindex/mytype/<objectId>/_source

In the result you will get only the _source.

See: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-get.html

Well, this assumes you know the document's ID. I am not sure whether the metadata can be excluded when using the search API.

Maybe: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-source-filtering.html
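
For what it is worth, source filtering could be applied to the search from the question roughly as in the Python sketch below. It assumes an Elasticsearch version that supports both the _source option in the request body and the filter_path URL parameter; the helper name build_search is hypothetical, and the index, type, field names and term query are taken from the question:

```python
import json


def build_search(index, doc_type, field_names, query):
    """Return (path, JSON body) for a search limited to some source fields."""
    # "_source" restricts which document fields are returned;
    # filter_path trims the metadata around them in the response.
    body = {"_source": list(field_names), "query": query}
    path = f"/{index}/{doc_type}/_search?filter_path=hits.hits._source"
    return path, json.dumps(body)


path, body = build_search(
    "myindex",
    "mytype",
    ["user_id", "loc_code"],
    {"term": {"group_id": "1sd323s"}},
)
print(path)
print(body)
```

The printed path and body are what you would send as the POST request; whether this works depends on the server version, so treat it as a starting point only.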