Unable to get the correct cancer count in patient data using _count

Time: 2019-07-24 15:17:07

Tags: elasticsearch

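I need the total number of cancers that patients have had, but I only get the number of patients who have cancer, not the total number of cancers.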

I have tried a few answers, but I don't really understand how they work:

To test, I tried to get the cancer count for one patient who is known to have two cancers, and I got 1 instead of 2.

#
# A specific patient who has 2 tumors
#
GET test/_count
{
  "query": {
    "match": {
      "ipp": "identifier"
    }
  }
}


#
# Count how many cancers this patient has had
#
GET test/_count
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "inferedCancers.tumorEvolutions",
            "query": {
              "match": {
                "inferedCancers.tumorEvolutions.typeTumorEvolution": "INITIAL"
              }
            }
          }
        },
        {
          "match": {
            "ipp": "identifier"
          }
        }
      ]
    }
  }
}

Output:

{
  "count" : 1,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  }
}

# count should be 2
{
  "count" : 1,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  }
}
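
For what it's worth, a result of 1 looks consistent with how `_count` behaves: it counts the top-level documents that match the query, and a `nested` query only decides whether a parent document matches. The sketch below imitates that in plain Python, with no Elasticsearch involved. The document shape (one patient document holding two cancers) and the `RELAPSE` value are assumptions made for illustration, since the real mapping is not shown here.

```python
# Hypothetical patient document; the real mapping is not shown in the question.
# "RELAPSE" is a made-up value used only to have a non-INITIAL entry.
patient = {
    "ipp": "identifier",
    "inferedCancers": [
        {"tumorEvolutions": [{"typeTumorEvolution": "INITIAL"}]},
        {"tumorEvolutions": [{"typeTumorEvolution": "INITIAL"},
                             {"typeTumorEvolution": "RELAPSE"}]},
    ],
}
index = [patient]


def initial_evolutions(doc):
    """Return every nested tumorEvolutions entry of type INITIAL in one document."""
    return [
        evo
        for cancer in doc.get("inferedCancers", [])
        for evo in cancer.get("tumorEvolutions", [])
        if evo.get("typeTumorEvolution") == "INITIAL"
    ]


def count_documents(docs, ipp):
    """Imitate _count with a nested query: one hit per matching top-level document."""
    return sum(1 for d in docs if d["ipp"] == ipp and initial_evolutions(d))


def count_nested(docs, ipp):
    """Count the matching nested entries themselves."""
    return sum(len(initial_evolutions(d)) for d in docs if d["ipp"] == ipp)


print(count_documents(index, "identifier"))  # 1
print(count_nested(index, "identifier"))     # 2
```

Under these assumptions the first function gives 1 because there is only one patient document, while the second gives 2 because it looks inside that document.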

Edit: Using inner hits ("How to count of nested inner_hits documents?"), I found a solution, but it uses _search:

GET test/_search
{
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "inferedCancers.tumorEvolutions",
            "query": {
              "match": {
                "inferedCancers.tumorEvolutions.typeTumorEvolution": "INITIAL"
              }
            }
          }
        },
        {
          "match": {
            "ipp": "identifier"
          }
        }
      ]
    }
  },
  "aggs": {
    "initials": {
      "nested": {
        "path": "inferedCancers.tumorEvolutions"
      }
    }
  }
}

Output:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "initials" : {
      "doc_count" : 2
    }
  }
}
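
One caveat with this aggregation: a bare `nested` aggregation counts every `tumorEvolutions` entry of the matched documents, whatever its type, so the 2 is not necessarily the number of INITIAL entries. A possible refinement is to put a `filter` sub-aggregation inside the `nested` one. The sketch below builds such a request body as a Python dict. The function name and the aggregation names `evolutions` and `initials_only` are made up for illustration, and the body has not been run against the real index.

```python
import json

PATH = "inferedCancers.tumorEvolutions"


def build_initial_count_body(ipp):
    """Build a _search body that counts only INITIAL nested entries for one patient."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "query": {"match": {"ipp": ipp}},
        "aggs": {
            # Hypothetical aggregation name: step into the nested documents.
            "evolutions": {
                "nested": {"path": PATH},
                "aggs": {
                    # Hypothetical aggregation name: keep only INITIAL entries.
                    "initials_only": {
                        "filter": {
                            "match": {PATH + ".typeTumorEvolution": "INITIAL"}
                        }
                    }
                },
            }
        },
    }


body = build_initial_count_body("identifier")
print(json.dumps(body, indent=2))
```

Sent as the body of `GET test/_search`, the number of interest should then appear under `aggregations.evolutions.initials_only.doc_count` in the response.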

Is it possible to do this with _count?

0 answers:

No answers