Compound Elasticsearch filters

Time: 2017-02-07 16:52:12

Tags: elasticsearch elasticsearch-dsl

I am seeing anomalous behavior in the inner_hits results of a nested bool query.

Test data (abbreviated for brevity):

# MAPPING
PUT unit_testing
{
    "mappings": {
        "document": {
            "properties": {
                "display_name": {"type": "text"},
                "metadata": {
                    "properties": {
                        "NAME": {"type": "text"}
                    }
                }
            }
        },
        "paragraph": {
            "_parent": {"type": "document"},
            "_routing": {"required": true},
            "properties": {
                "checksum": {"type": "text"},
                "sentences": {
                    "type": "nested",
                    "properties": {
                        "text": {"type": "text"}
                    }
                }
            }
        }
    }
}

# DOCUMENT X 2 (d0, d1)
PUT unit_testing/document/doc_id_d0
{
    "display_name": "Test Document d0",
    "paragraphs": [
        "para_id_d0p0",
        "para_id_d0p1"
    ],
    "metadata": {"NAME": "Test Document d0 Metadata"}
}

# PARAGRAPH X 2 (d0p0, d1p0)
PUT unit_testing/paragraph/para_id_d0p0?parent=doc_id_d0
{
    "checksum": "para_checksum_d0p0",
    "sentences": [
        {"text": "Test sentence d0p0s0"},
        {"text": "Test sentence d0p0s1 ODD"},
        {"text": "Test sentence d0p0s2 EVEN"},
        {"text": "Test sentence d0p0s3 ODD"},
        {"text": "Test sentence d0p0s4 EVEN"}
    ]
}

This initial query behaves as I expect (I know the metadata filter is not actually needed in this example):

GET unit_testing/paragraph/_search
{
    "_source": "false", 
    "query": {
        "bool": {
            "must": [
                {
                    "has_parent": {
                        "query": {
                            "match_phrase": {
                                "metadata.NAME": "Test Document d0 Metadata"
                            }
                        }, 
                        "type": "document"
                    }
                }, 
                {
                    "nested": {
                        "inner_hits": {}, 
                        "path": "sentences", 
                        "query": {
                            "match": {
                                "sentences.text": "d0p0s0"
                            }
                        }
                    }
                }
            ]
        }
    }
}

It produces an inner_hits object containing the one sentence that matches the predicate (some fields removed for clarity):

{
  "hits": {
    "hits": [
      {
        "_source": {},
        "inner_hits": {
          "sentences": {
            "hits": {
              "hits": [
                {
                  "_source": {
                    "text": "Test sentence d0p0s0"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

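The following query is an attempt to embed the query above in a parent "should" clause, creating a logical OR between the initial query and an additional query that matches a single sentence: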

GET unit_testing/paragraph/_search
{
    "_source": "false", 
    "query": {
        "bool": {
            "should": [
                {
                    "bool": {
                        "must": [
                            {
                                "has_parent": {
                                    "query": {
                                        "match_phrase": {
                                            "metadata.NAME": "Test Document d0 Metadata"
                                        }
                                    }, 
                                    "type": "document"
                                }
                            }, 
                            {
                                "nested": {
                                    "inner_hits": {}, 
                                    "path": "sentences", 
                                    "query": {
                                        "match": {
                                            "sentences.text": "d0p0s0"
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }, 
                {
                    "nested": {
                        "inner_hits": {}, 
                        "path": "sentences", 
                        "query": {
                            "match": {
                                "sentences.text": "d1p0s0"
                            }
                        }
                    }
                }
            ]
        }
    }
}

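While the "d1" query outputs the expected result, with an inner_hits object containing the matching sentence, the original "d0" query now produces an empty inner_hits object: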

{
  "hits": {
    "hits": [
      {
        "_source": {},
        "inner_hits": {
          "sentences": {
            "hits": {
              "total": 0,
              "hits": []
            }
          }
        }
      },
      {
        "_source": {},
        "inner_hits": {
          "sentences": {
            "hits": {
              "hits": [
                {
                  "_source": {
                    "text": "Test sentence d1p0s0"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}

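Although I am using the elasticsearch_dsl Python library to build and combine these queries, and I am somewhat new to the query DSL, the query format looks solid to me.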

What am I missing?

1 answer:

Answer 0: (score: 1)

I think what is missing is the name parameter of inner_hits - you have two inner_hits clauses in two different queries, and these end up using the same name. Try giving each inner_hits a name parameter (0).

0 - https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html#_options
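
As a rough illustration of the suggestion above, here is a minimal sketch that builds the second query from the question as plain Python dicts and gives each inner_hits its own name. For a nested query the default inner_hits name is the nested path ("sentences"), which is why the two clauses collide. The helper functions and the names "d0_sentences" and "d1_sentences" are hypothetical, chosen only for this example; the rest of the body is copied from the question.

```python
import json


def nested_sentence_query(term, hits_name):
    """Build a nested query on `sentences` with an explicitly named inner_hits.

    `hits_name` is a hypothetical label picked for this sketch; any value
    works as long as it is unique within the search request.
    """
    return {
        "nested": {
            "path": "sentences",
            "query": {"match": {"sentences.text": term}},
            "inner_hits": {"name": hits_name},
        }
    }


def build_query():
    """Recreate the question's OR query, with distinct inner_hits names."""
    d0_clause = {
        "bool": {
            "must": [
                {
                    "has_parent": {
                        "type": "document",
                        "query": {
                            "match_phrase": {
                                "metadata.NAME": "Test Document d0 Metadata"
                            }
                        },
                    }
                },
                nested_sentence_query("d0p0s0", "d0_sentences"),
            ]
        }
    }
    d1_clause = nested_sentence_query("d1p0s0", "d1_sentences")
    return {
        "_source": "false",  # kept exactly as written in the question
        "query": {"bool": {"should": [d0_clause, d1_clause]}},
    }


if __name__ == "__main__":
    # Print the request body so it can be pasted after
    # GET unit_testing/paragraph/_search
    print(json.dumps(build_query(), indent=4))
```

With names like these, each hit's inner_hits section should be keyed by "d0_sentences" and "d1_sentences" instead of a single shared "sentences" key, so results from one clause no longer overwrite the other. When building the queries with elasticsearch_dsl, the same dict can presumably be passed as the inner_hits argument of the nested query, though that is an assumption to check against the library version in use.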