Question

I am using Elasticsearch to find similar documents. My documents are stored in an index with the following mapping:

"mappings": {
    "document": {
      "properties": {
        "content": {
          "type": "text",
          "fielddata": true
        }
      }
    }
  }

To search for similar documents, I use a more_like_this query:

{
  "query": {
    "bool": {
      "must": [
        {
          "more_like_this": {
            "fields": [
              "content"
            ],
            "like": [
              "Lorem ipsum dolor sit amet"
            ],
            "min_doc_freq": 1,
            "min_term_freq": 1
          }
        }
      ]
    }
  }
}

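Now I want to score the filtered documents by the number of tokens they share with the search text.

E.g., the search text above produces 5 tokens (Lorem, ipsum, dolor, sit, amet). If a retrieved document contains 4 of those tokens, its score should be 4/5.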
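
The intended ratio can be sketched in plain Java, independent of Elasticsearch. This is only an illustration of the formula: the class and method names (`TokenOverlap`, `overlapScore`) are hypothetical, and the tokens are assumed to be already analyzed (for example lowercased) on both sides.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TokenOverlap {

    // Fraction of distinct query tokens that also occur in the document.
    public static double overlapScore(Collection<String> queryTokens,
                                      Collection<String> docTokens) {
        Set<String> query = new HashSet<>(queryTokens);
        if (query.isEmpty()) {
            return 0.0; // avoid division by zero for an empty query
        }
        Set<String> doc = new HashSet<>(docTokens);
        int matched = 0;
        for (String token : query) {
            if (doc.contains(token)) {
                matched++;
            }
        }
        return (double) matched / query.size();
    }

    public static void main(String[] args) {
        List<String> query = Arrays.asList("lorem", "ipsum", "dolor", "sit", "amet");
        List<String> doc = Arrays.asList("lorem", "ipsum", "dolor", "sit");
        System.out.println(overlapScore(query, doc)); // 4 of 5 tokens match: 0.8
    }
}
```

Using sets means a token repeated in the query or the document is counted once, which matches the 4/5 example; a frequency-aware variant would need a different denominator.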

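To implement this logic (with a few enhancements), I tried writing my own native script with a custom AbstractDoubleSearchScript implementation.

I can access the tokens of the retrieved document using the doc() function, but I cannot figure out how to access:

1) the tokens generated from the query text (Lorem, ipsum, dolor, sit, amet)

2) the score associated with each token (based on TF/IDF, or produced by a custom similarity module)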

    @Override
    public ExecutableScript newScript(Map<String, Object> params) {

        return new AbstractDoubleSearchScript() {

            @Override
            public double runAsDouble() {

                // Tokens of the indexed document (readable because fielddata is enabled)
                ScriptDocValues doc = (ScriptDocValues) doc().get("content");

                // prints doc values : [ipsum, sit]
                System.out.println("doc values : " + doc);

                // TODO: get the tokens of the query text (this is what I cannot access)

                // Intended: score = (matching document tokens) / (query tokens)
                double score = 0.0; // placeholder until the query tokens are available

                return score;
            }
        };
    }
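
One possible workaround, sketched here without any Elasticsearch dependency, is to analyze the query text on the client (for example with the `_analyze` API) and hand the resulting tokens to the script through `params`. The parameter name `query_tokens` and the class `ParamTokenScorer` are assumptions made for illustration, not part of any Elasticsearch API. Inside a real script, the document tokens would come from `doc().get("content")`.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ParamTokenScorer {

    private final Set<String> queryTokens = new HashSet<>();

    // Reads the hypothetical "query_tokens" script parameter (a list of strings).
    public ParamTokenScorer(Map<String, Object> params) {
        Object raw = params.get("query_tokens");
        if (raw instanceof List) {
            for (Object token : (List<?>) raw) {
                queryTokens.add(String.valueOf(token));
            }
        }
    }

    // Plays the role of runAsDouble(): matched query tokens / total query tokens.
    public double score(Iterable<?> docTokens) {
        if (queryTokens.isEmpty()) {
            return 0.0; // no tokens supplied, nothing to match
        }
        Set<String> matched = new HashSet<>();
        for (Object token : docTokens) {
            String value = String.valueOf(token);
            if (queryTokens.contains(value)) {
                matched.add(value);
            }
        }
        return (double) matched.size() / queryTokens.size();
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("query_tokens", Arrays.asList("lorem", "ipsum", "dolor", "sit", "amet"));
        ParamTokenScorer scorer = new ParamTokenScorer(params);
        // Document tokens as printed in the question: [ipsum, sit]
        System.out.println(scorer.score(Arrays.asList("ipsum", "sit"))); // 2 of 5 match: 0.4
    }
}
```

This only covers point 1 (the query tokens). It does not address point 2, the per-token TF/IDF or similarity score, which is not derivable from the parameters alone.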

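The documentation available online is of little help when writing custom plugins. Can someone tell me whether I am implementing my logic in the right place, and what I can do to access the required data?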

ElasticSearch Native Script: Accessing Search Query Tokens
