Question

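I have an index containing the full text of books that belong to particular series. Each document represents one volume in a series, and each volume has a set of nested documents, each corresponding to a section of text in that book. This is the query we use to get highlights matching a specific phrase across all books in a given series: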

{
  "from": 0,
  "size": 3,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "sections.content.phrase": {
                        "query": "theory legal",
                        "type": "phrase",
                        "slop": X
                      }
                    }
                  }
                ]
              }
            },
            "path": "sections",
            "inner_hits": {
              "highlight": {
                "order": "score",
                "fields": {
                  "sections.content.phrase": {}
                }
              },
              "_source": {
                "include": [
                  "title",
                  "id"
                ]
              }
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "series": "00410"
          }
        }
      ]
    }
  }
}
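
Since the query above leaves the slop as a placeholder (X), here is a minimal sketch of how the same body could be generated for a concrete slop value so the two runs can be compared side by side. The function name `build_query` and its parameters are hypothetical and not part of the original setup; the body itself mirrors the query as posted.

```python
import json


def build_query(slop, series, phrase="theory legal", size=3):
    """Return the search body from the question with a concrete slop value.

    build_query is a hypothetical helper; only the structure of the
    returned dict comes from the query shown in the question.
    """
    return {
        "from": 0,
        "size": size,
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": "sections",
                            "query": {
                                "bool": {
                                    "must": [
                                        {
                                            "match": {
                                                "sections.content.phrase": {
                                                    "query": phrase,
                                                    "type": "phrase",
                                                    "slop": slop,
                                                }
                                            }
                                        }
                                    ]
                                }
                            },
                            "inner_hits": {
                                "highlight": {
                                    "order": "score",
                                    "fields": {"sections.content.phrase": {}},
                                },
                                "_source": {"include": ["title", "id"]},
                            },
                        }
                    }
                ],
                "filter": [{"term": {"series": series}}],
            }
        },
    }


if __name__ == "__main__":
    # Print the two bodies that are compared in the question (slop 1 vs. 3).
    for slop in (1, 3):
        print(json.dumps(build_query(slop, "00410"), indent=2))
```

The printed bodies differ only in the slop value, which makes it easier to confirm that nothing else changes between the two requests.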

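This query usually works fine, but for some series we get hits on books with no highlighted text. For example, using the phrase query above with that series and a slop value of 1, we correctly get a hit in one book of the series: (each allegation of discrimination or each <em>theory</em> of <em>legal</em> recovery not required to be set forth in separate. If we take the same query and raise the slop value to 3, we suddenly get hits in 5 different books, none of which contains a matching highlight. Even the original hit returned with a slop of 1 is missing. Why are we getting these results?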
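
To pin down exactly which hits come back without highlights, a small helper along these lines could be run over each response. This is only a sketch: it assumes the inner hits are returned under the nested path name (`sections`) and that a hit with no highlight simply lacks a `highlight` key. The function name and the sample response are made up for illustration and are not real output from the index.

```python
def hits_without_highlights(response, inner_name="sections"):
    """Return (book _id, nested offset) pairs for inner hits lacking a highlight.

    Assumes the usual search response layout:
    hits.hits[].inner_hits.<inner_name>.hits.hits[].highlight
    """
    missing = []
    for book in response.get("hits", {}).get("hits", []):
        inner = book.get("inner_hits", {}).get(inner_name, {})
        for section in inner.get("hits", {}).get("hits", []):
            # A missing or empty "highlight" entry counts as "no highlight".
            if not section.get("highlight"):
                offset = section.get("_nested", {}).get("offset")
                missing.append((book.get("_id"), offset))
    return missing


# Hypothetical response fragment, invented purely to exercise the helper.
SAMPLE_RESPONSE = {
    "hits": {
        "hits": [
            {
                "_id": "vol-1",
                "inner_hits": {"sections": {"hits": {"hits": [
                    {
                        "_nested": {"field": "sections", "offset": 4},
                        "highlight": {
                            "sections.content.phrase": [
                                "each <em>theory</em> of <em>legal</em> recovery"
                            ]
                        },
                    }
                ]}}},
            },
            {
                "_id": "vol-2",
                "inner_hits": {"sections": {"hits": {"hits": [
                    {"_nested": {"field": "sections", "offset": 7}}
                ]}}},
            },
        ]
    }
}

if __name__ == "__main__":
    print(hits_without_highlights(SAMPLE_RESPONSE))  # [('vol-2', 7)]
```

Running it on the slop 1 and slop 3 responses would give the list of volumes and section offsets whose stored text can then be inspected directly.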

"match_phrase" hits return no highlights

0 answers: