How do I get all results from a nested datatype query in Elasticsearch?

Asked: 2019-06-24 22:23:50

Tags: elasticsearch nested lucene

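I am trying to use Elasticsearch's nested datatype to get all matching sentences from a block of text. In the query below, I filter for every sentence that contains both "the" and "of".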

GET myindex/doc/_search
{
  "from": 0,
  "size": 1,
  "query": {
    "nested": {
      "path": "parent.data",
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "parent.data.sentence": "the"
              }
            },
            {
              "match": {
                "parent.data.sentence": "of"
              }
            }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}

Although the response below reports a total of 544 matching inner hits, ES returns only three of them. How can I get all of them?

{
  "took": 709,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 73783,
    "max_score": 0,
    "hits": [
      {
        "_index": "my-index",
        "_type": "doc",
        "_id": "bd9e3c03741956db68fd692a6914e811b0749baaf6565c6385380919f1ce3932",
        "_score": 0,
        "_source": {},
        "inner_hits": {
          "parent.data.sentence": {
            "hits": {
              "total": 544,
              "max_score": 0,
              "hits": [<response containing 3 sentences>]
            }
          }
        }
      }
    ]
  }
}

1 Answer:

Answer 0 (score: 1)

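You can use the from and size options of inner_hits to get the number of inner hits you need. By default, the inner_hits size is 3. To increase it to 20, update the query as follows: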

GET myindex/doc/_search
{
  "from": 0,
  "size": 1,
  "query": {
    "nested": {
      "path": "parent.data",
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "parent.data.sentence": "the"
              }
            },
            {
              "match": {
                "parent.data.sentence": "of"
              }
            }
          ]
        }
      },
      "inner_hits": {
        "size": 20
      }
    }
  }
}

If you do not need to fetch all the records at once, I would suggest paging through them with a combination of from and size.
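As a rough illustration of the from/size paging suggested above, here is a minimal Python sketch that only builds the request bodies; it does not talk to a cluster. The helper names build_nested_query and inner_hit_pages and the page size of 50 are made up for this example, and 544 is simply the inner hit total reported in the question. One caveat, as far as I know: Elasticsearch versions that have the index.max_inner_result_window index setting limit from + size for inner hits (100 by default), so walking through all 544 this way assumes that setting has been raised accordingly.

```python
import json


def build_nested_query(inner_from, inner_size):
    """Build the search body from the answer, with explicit inner_hits paging.

    Hypothetical helper: field names and terms are copied from the question.
    """
    return {
        "from": 0,
        "size": 1,
        "query": {
            "nested": {
                "path": "parent.data",
                "query": {
                    "bool": {
                        "filter": [
                            {"match": {"parent.data.sentence": "the"}},
                            {"match": {"parent.data.sentence": "of"}},
                        ]
                    }
                },
                # Page through the inner hits instead of taking the default 3.
                "inner_hits": {"from": inner_from, "size": inner_size},
            }
        },
    }


def inner_hit_pages(total, page_size):
    """Yield (from, size) pairs that together cover `total` inner hits."""
    for start in range(0, total, page_size):
        yield start, min(page_size, total - start)


if __name__ == "__main__":
    # 544 is the inner hit total from the question's response;
    # the page size of 50 is an arbitrary choice for this sketch.
    for start, size in inner_hit_pages(544, 50):
        body = build_nested_query(start, size)
        # Send `body` to the _search endpoint with whatever client you use.
        print(json.dumps(body["query"]["nested"]["inner_hits"]))
```

Each generated body is the same query as above with only the inner_hits from and size changed, so it can be sent to the same _search endpoint one page at a time.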