Question

I'm trying to figure out how to do something like this. Say I have 3 documents with the following structure:

{
    "name": "test one",
    "images": [
        {
            "id": 1
        }
    ]
}


{
    "name": "test two",
    "images": []
}


{
    "name": "test three",
    "images": [
        {
            "id": 2
        }
    ]
}

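I want to get the number of documents WITH objects in the images field (2 in this case) or, less preferably, the number of documents WITHOUT objects in the images field (1 in this case). This is for an aggregation query, in case that isn't obvious. I have tried about 100 different aggregation types, including this: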

... 
"withoutPhotos": {
  "nested": {
    "path": "images"
  },
  "aggs": {
    "noPhoto": {
      "missing": {
        "field": "images.id"
      }
    }
  }
}

and this:

... 
"withoutPhotos": {
  "missing": {
    "field": "images"
  }
}

and plenty of others. Any ideas?

Answer 1

Here is a query that returns the number of documents missing the images.id field (it looks very similar to yours):

curl -XGET 'http://localhost:9200/index/test/_search?search_type=count&pretty' -d '
{
  "query": { "match_all": {} },
  "aggs": {
    "noPhoto": { "missing": { "field": "images.id" } }
  }
}'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "noPhoto" : {
      "doc_count" : 1
    }
  }
}
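
To illustrate why doc_count comes back as 1 for the sample data, here is a minimal Python sketch that emulates the counting locally. It is only an emulation of the idea (count documents with no value for images.id), not how Elasticsearch computes it, and the helper names image_ids and missing_count are hypothetical.

```python
# Local emulation of the `missing` aggregation on the three sample documents.
# Nothing here talks to Elasticsearch; the helper names are made up.
docs = [
    {"name": "test one", "images": [{"id": 1}]},
    {"name": "test two", "images": []},
    {"name": "test three", "images": [{"id": 2}]},
]


def image_ids(doc):
    """Collect every images.id value present in one document."""
    return [img["id"] for img in doc.get("images", []) if img.get("id") is not None]


def missing_count(documents):
    """Count documents that have no value at all for images.id."""
    return sum(1 for doc in documents if not image_ids(doc))


print(missing_count(docs))  # 1 ("test two" is the only document without images)
```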

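Here is a query that returns the number of images.id values. I'm not entirely sure this is what you want, because it returns the number of field values rather than the number of documents.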

curl -XGET 'http://localhost:9200/index/test/_search?search_type=count&pretty' -d '
{
  "query": { "match_all": {} },
  "aggs": {
    "images_count": { "value_count": { "field": "images.id" } }
  }
}'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "images_count" : {
      "value" : 2
    }
  }
}
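
The difference between counting values and counting documents only shows up once a document holds more than one image. The sketch below emulates both counts locally in Python; the fourth document ("test four") is a hypothetical addition that is not part of the question, and the function names are invented for illustration.

```python
# Local emulation: number of images.id values versus number of documents
# that have at least one image. Nothing here talks to Elasticsearch.
sample_docs = [
    {"name": "test one", "images": [{"id": 1}]},
    {"name": "test two", "images": []},
    {"name": "test three", "images": [{"id": 2}]},
]

# Hypothetical extra document with two images, added only to show the gap.
extended_docs = sample_docs + [{"name": "test four", "images": [{"id": 3}, {"id": 4}]}]


def image_ids(doc):
    """Collect every images.id value present in one document."""
    return [img["id"] for img in doc.get("images", []) if img.get("id") is not None]


def value_count(documents):
    """Total number of images.id values across all documents."""
    return sum(len(image_ids(doc)) for doc in documents)


def docs_with_images(documents):
    """Number of documents that contain at least one images.id value."""
    return sum(1 for doc in documents if image_ids(doc))


print(value_count(sample_docs), docs_with_images(sample_docs))      # 2 2
print(value_count(extended_docs), docs_with_images(extended_docs))  # 4 3
```

With the three documents from the question the two numbers happen to coincide, which is why the value count of 2 looks like the answer here.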

Another option is to add something to the query that looks for "images.id", for example a wildcard.
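
One possible way to look for "images.id" directly is sketched below as a Python dict that builds a request body. Note that it swaps the wildcard idea for an exists check wrapped in a filter aggregation, so that both counts come back in one response. This is an assumption, not something verified against a cluster: whether exists is accepted inside a filter aggregation depends on the Elasticsearch version, and if images is mapped as nested the aggregations would probably need to sit inside a nested aggregation. The aggregation name withImages and the function build_count_body are hypothetical.

```python
import json


def build_count_body(field="images.id"):
    """Build a search body that counts documents with and without `field`.

    Assumption: the cluster accepts `exists` inside a `filter` aggregation.
    """
    return {
        "query": {"match_all": {}},
        "aggs": {
            # Hypothetical name: documents that have at least one value.
            "withImages": {"filter": {"exists": {"field": field}}},
            # Same aggregation as in the answers: documents with no value.
            "noImage": {"missing": {"field": field}},
        },
    }


# Print the JSON that would be passed to curl with -d.
print(json.dumps(build_count_body(), indent=2))
```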

Answer 2

I would suggest this approach.

curl -XGET 'http://localhost:9200/test/test/_search?search_type=count&pretty' -d '
 { "query" :{
  "match_all": { } 
     },
    "aggs": {
     "noImage": { "missing": {"field": "images.id"}  }
   }
 }'

Result

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "noImage" : {
      "doc_count" : 1
    }
  }
}

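Here, hits.total is the number of documents matched by the query (with match_all, that is every document in the index), and aggregations.noImage.doc_count is the number of documents without images. The number of documents that have images is therefore hits.total - aggregations.noImage.doc_count.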

Documents with images = hits.total - aggregations.noImage.doc_count
Documents without images = aggregations.noImage.doc_count
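
The subtraction can be sketched in Python against the response shown above. The function name count_with_and_without is hypothetical. The dict branch is there because newer Elasticsearch versions return hits.total as an object with a value key rather than a plain number.

```python
def count_with_and_without(response, agg_name="noImage"):
    """Return (documents with images, documents without images)."""
    total = response["hits"]["total"]
    # Newer Elasticsearch versions wrap the total in {"value": ..., "relation": ...}.
    if isinstance(total, dict):
        total = total["value"]
    without = response["aggregations"][agg_name]["doc_count"]
    return total - without, without


# Response copied from the result above (only the relevant parts).
response = {
    "hits": {"total": 3, "max_score": 0.0, "hits": []},
    "aggregations": {"noImage": {"doc_count": 1}},
}

with_images, without_images = count_with_and_without(response)
print(with_images, without_images)  # 2 1
```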

Count of documents without nested documents
