I am trying to use Elasticsearch's nested datatype to retrieve all matching sentences from a block of text. In the query below, I filter for sentences that contain both "the" and "of".
GET myindex/doc/_search
{
  "from": 0,
  "size": 1,
  "query": {
    "nested": {
      "path": "parent.data",
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "parent.data.sentence": "the"
              }
            },
            {
              "match": {
                "parent.data.sentence": "of"
              }
            }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}
Although the response below reports a total of 544 matching nested documents, ES returns only three of them. How can I retrieve all of them?
{
  "took": 709,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 73783,
    "max_score": 0,
    "hits": [
      {
        "_index": "my-index",
        "_type": "doc",
        "_id": "bd9e3c03741956db68fd692a6914e811b0749baaf6565c6385380919f1ce3932",
        "_score": 0,
        "_source": {},
        "inner_hits": {
          "parent.data.sentence": {
            "hits": {
              "total": 544,
              "max_score": 0,
              "hits": [<response containing 3 sentences>]
            }
          }
        }
      }
    ]
  }
}
Answer 0 (score: 1)
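You can use the from and size options of inner_hits to get the number of inner hits you want. By default, the size of inner hits is 3. To increase it to 20, update the query as follows: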
GET myindex/doc/_search
{
  "from": 0,
  "size": 1,
  "query": {
    "nested": {
      "path": "parent.data",
      "query": {
        "bool": {
          "filter": [
            {
              "match": {
                "parent.data.sentence": "the"
              }
            },
            {
              "match": {
                "parent.data.sentence": "of"
              }
            }
          ]
        }
      },
      "inner_hits": {
        "size": 20
      }
    }
  }
}
If you do not want to fetch all the records at once, I suggest using from and size together to page through the inner hits.
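As a rough illustration of paging with from and size, here is a minimal Python sketch that only builds the request bodies; it does not talk to a cluster. The helper names nested_sentence_query and inner_hit_pages are hypothetical, and the total of 544 is taken from the response in the question. One caveat, as far as I know: the sum of from and size for inner hits is capped by the index setting index.max_inner_result_window (100 by default), so pages beyond that point would likely need the setting raised.

```python
import json


def nested_sentence_query(terms, inner_from=0, inner_size=20):
    """Build the search body from the answer, with from/size set on inner_hits.

    Hypothetical helper: the field and path names are copied from the question.
    """
    return {
        "from": 0,
        "size": 1,
        "query": {
            "nested": {
                "path": "parent.data",
                "query": {
                    "bool": {
                        # One match clause per term; all must match the same sentence.
                        "filter": [
                            {"match": {"parent.data.sentence": term}}
                            for term in terms
                        ]
                    }
                },
                "inner_hits": {"from": inner_from, "size": inner_size},
            }
        },
    }


def inner_hit_pages(total, page_size):
    """Yield (from, size) pairs that together cover `total` inner hits."""
    for start in range(0, total, page_size):
        yield start, min(page_size, total - start)


# 544 is the inner-hit total reported in the question's response.
pages = list(inner_hit_pages(544, 20))
bodies = [nested_sentence_query(["the", "of"], f, s) for f, s in pages]

# Print the second page's body; it can be sent as the body of
# GET myindex/doc/_search.
print(json.dumps(bodies[1], indent=2))
```

Each element of bodies is one request; sending them in order and concatenating the inner hits from every response gives the full list of matching sentences for that parent document.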