Synonyms break the alias datatype

Time: 2019-04-19 22:01:17

Tags: elasticsearch

When an index uses both the alias datatype and synonyms, including the synonym filter seems to break the alias field.

To reproduce the issue:

# Create the index
PUT /alias.synonyms
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "standard",
          "filter": [
            "my_synonyms"
          ]
        }
      },
      "filter": {
        "my_synonyms": {
          "type": "synonym",
          "synonyms": [
            "big,large"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_target": {
        "type": "text"
      },
      "my_alias": {
        "type": "alias",
        "path": "my_target"
      }
    }
  }
}

# Add a document
POST /alias.synonyms/_doc/1
{"my_target":"this sentence has big in it"}
GET /alias.synonyms/_doc/1

# (Search #1) Search for existing synonym word in text field (gets hit)
GET /alias.synonyms/_search
{
  "explain": true,
  "query": {
    "match": {
      "my_target": "big"
    }
  }
}

# (Search #2) Search for non-existing synonym word in text field (gets hit)
GET /alias.synonyms/_search
{
  "explain": true,
  "query": {
    "match": {
      "my_target": "large"
    }
  }
}

# (Search #3) Search for existing non-synonym word in text field (gets hit)
GET /alias.synonyms/_search
{
  "explain": true,
  "query": {
    "match": {
      "my_target": "sentence"
    }
  }
}

# (Search #4) Search for existing synonym word in alias field (no hit, but one was expected)
GET /alias.synonyms/_search
{
  "explain": true,
  "query": {
    "match": {
      "my_alias": "big"
    }
  }
}

# (Search #5) Search for non-existing synonym word in alias field (no hit, but one was expected)
GET /alias.synonyms/_search
{
  "explain": true,
  "query": {
    "match": {
      "my_alias": "large"
    }
  }
}

# (Search #6) Search for existing non-synonym word in alias field (gets hit)
GET /alias.synonyms/_search
{
  "explain": true,
  "query": {
    "match": {
      "my_alias": "sentence"
    }
  }
}

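What seems odd to me is that it is not only the synonym absent from the document that fails on the alias field (Search #5). Searching the alias field for "big" also returns nothing, even though that word appears verbatim in the document (Search #4).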
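One way to narrow this down would be to compare how the two `match` queries are rewritten, using the validate API with `rewrite=true`. This is only a diagnostic sketch: it assumes the returned explanation shows which field name each term is bound to, and no output is shown because none was captured for this setup.

```console
# Rewritten query for the concrete field (same query as Search #1)
GET /alias.synonyms/_validate/query?rewrite=true
{
  "query": {
    "match": {
      "my_target": "big"
    }
  }
}

# Rewritten query for the alias field (same query as Search #4)
GET /alias.synonyms/_validate/query?rewrite=true
{
  "query": {
    "match": {
      "my_alias": "big"
    }
  }
}
```

If the two explanations differ in the field name attached to the terms, that would point at alias resolution rather than at the synonym list itself.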

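Running the same commands as above, but leaving the synonym filter out of the index settings, returns hits for "sentence" (Searches #3 and #6) and "big" (Searches #1 and #4) on both fields and, as expected, no hits for "large" (Searches #2 and #5).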
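For reference, the control index for that comparison could look roughly like this. The index name `alias.nosynonyms` is made up for this sketch. The only intended difference from the original index is that the `my_synonyms` filter is gone.

```console
# Control index: same mapping, default analyzer without the synonym filter
# (the index name alias.nosynonyms is hypothetical)
PUT /alias.nosynonyms
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "tokenizer": "standard"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_target": {
        "type": "text"
      },
      "my_alias": {
        "type": "alias",
        "path": "my_target"
      }
    }
  }
}
```

The document and Searches #1 to #6 are then repeated with `/alias.nosynonyms` in place of `/alias.synonyms`.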

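Is there something wrong with the setup above? Or is Elasticsearch unable to handle queries against an alias field when the analyzer includes a synonym filter?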
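To rule out the analyzer half of the setup on its own, the analyze API can show which tokens are produced for the concrete field. Again a sketch rather than a confirmed result: if the synonym filter is active, I would expect both `big` and `large` to be listed, but no output was captured here.

```console
# Tokens produced for "big" by the analyzer that applies to my_target
GET /alias.synonyms/_analyze
{
  "field": "my_target",
  "text": "big"
}
```

If both tokens do appear, the analysis settings themselves are probably fine and the difference lies in how the query against `my_alias` is built.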

I am currently using Elasticsearch 7.0.

0 answers:

No answers