Synonym token filter

Date: 2020-04-01 09:09:39

Tags: elasticsearch

I created a test index with a synonym token filter:

PUT /synonyms-index
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "shares","equity","stock"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}

Then I ran the analyze API:

POST /synonyms-index/_analyze
{
"analyzer":"my_synonyms",
"text":"equity awesome"
}

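I received the response below, which shows the tokens that go into the inverted index. I expected "shares" and "stock" to be added as well, per the synonym rule, but that is not the case. Am I missing something here?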

{
  "tokens": [
    {
      "token": "equity",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "awesome",
      "start_offset": 7,
      "end_offset": 14,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

1 Answer:

Answer 0 (score: 0)

Posting as a community answer:

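This is a common pitfall when writing the synonym list as JSON. Each string in the "synonyms" array is a separate rule, so three one-word strings define three rules that each contain a single term and expand to nothing.

The whole rule has to go inside one pair of double quotes. The comma-separated terms then form a single rule, which follows simple expansion: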

"synonyms": [ "shares,equity,stock" ]

instead of

"synonyms": [ "shares", "equity", "stock" ]
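
Putting the fix into the original request, the body for PUT /synonyms-index might look like the sketch below. It reuses the filter and analyzer names from the question and changes only the "synonyms" array. One assumption: the test index can be deleted (DELETE /synonyms-index) and recreated, since analysis settings cannot be changed on an open index.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms": [
            "shares,equity,stock"
          ]
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}
```

Re-running the same _analyze request for "equity awesome" against the recreated index should then return "shares" and "stock" at the same position as "equity", followed by "awesome".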