Elasticsearch is not using the synonym file

Time: 2017-11-02 10:05:06

Tags: elasticsearch

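I'm new to Elasticsearch, so please read the question before downvoting or marking it as a duplicate.

I am testing synonyms in Elasticsearch (v2.4.6), which I installed on Ubuntu 16.04. I supply the synonyms through a file named synonym.txt, which I placed in the config directory. I created an index, synonym_test, as follows: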

curl -XPOST localhost:9200/synonym_test/ -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "whitespace",
          "filter": ["lowercase", "my_synonym_filter"]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path": "synonym.txt"
        }
      }
    }
  }
}'

The index contains two fields: id and some_text. I configured the some_text field to use the custom analyzer as follows:

curl -XPUT localhost:9200/synonym_test/rulers/_mapping -d '{
  "properties": {
    "id": {
      "type": "double"
    },
    "some_text": {
      "type": "string",
      "search_analyzer": "my_synonyms"
    }
  }
}'
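
A quick way to confirm which analyzer a field actually ended up with is to read the mapping back from the index. The sketch below wraps the standard `_mapping` endpoint in a small shell function; the function name `show_mapping` is made up for illustration, and the host `localhost:9200` is taken from the commands in this post.

```shell
# Hypothetical helper: print the mappings of an index so the analyzers
# attached to each field, per type, can be inspected.
show_mapping() {
  index="$1"
  curl -s -XGET "localhost:9200/${index}/_mapping?pretty"
}
```

Calling `show_mapping synonym_test` lists every type in the index. It may be worth checking that "search_analyzer": "my_synonyms" appears on some_text under the type that actually holds the documents, since the mapping request targets the type rulers.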

Then I inserted some data:

curl -XPUT localhost:9200/synonym_test/external/5 -d '{
  "id" : "5",
  "some_text":"apple is a fruit"
}'
curl -XPUT localhost:9200/synonym_test/external/7 -d '{
  "id" : "7",
  "some_text":"english is spoken in england"
}'
curl -XPUT localhost:9200/synonym_test/external/8 -d '{
  "id" : "8",
  "some_text":"Scotland Yard is a popular game."
}'
curl -XPUT localhost:9200/synonym_test/external/9 -d '{
  "id" : "9",
  "some_text":"bananas contain potassium"
}'

The synonym.txt file contains the following:

"britain,england,scotland"
"fruit,bananas"
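
By default, files referenced through synonyms_path are read in Solr synonym format, where each line is a plain comma-separated list with no surrounding quotes. If the double quotes shown here are literally present in the file (an assumption, since they may only be an artifact of how the post was formatted), they would likely become part of the first and last token on each line, so `fruit` and `bananas` would never match. A minimal sketch for producing an unquoted copy, using a made-up helper name `strip_quotes`:

```shell
# Hypothetical helper: print a synonym file with all double quotes removed,
# so each line is a plain comma-separated list (Solr synonym format).
strip_quotes() {
  sed 's/"//g' "$1"
}
```

For example, `strip_quotes synonym.txt > synonym.clean.txt` writes the cleaned lines to a new file. As far as I know, the synonym file is only read when the index's analyzers are built, so the index would need to be closed and reopened (or recreated) before a changed file takes effect.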

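After all of this, when I run a query for the term fruit (which should also return the text containing bananas, since the two are synonyms in the file), I only get the text containing fruit.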

{
  "took":117,
   "timed_out":false,
   "_shards":{  
      "total":5,
      "successful":5,
      "failed":0
   },
   "hits":{  
      "total":1,
      "max_score":0.8465736,
      "hits":[  
         {  
            "_index":"synonym_test",
            "_type":"external",
            "_id":"5",
            "_score":0.8465736,
            "_source":{  
               "id":"5",
               "some_text":"apple is a fruit"
            }
         }
      ]
   }
}
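
The request that produced this response is not included in the post. For reference, a match query of roughly the following shape is one way to run such a search; the exact query is an assumption, and `search_some_text` is a made-up helper name. The distinction matters because a match query analyzes the query text with the field's search analyzer, whereas a term query skips analysis entirely and so would never apply synonyms.

```shell
# Hypothetical helper: run a match query against the some_text field.
# The query text passed as $1 goes through the field's search analyzer.
search_some_text() {
  curl -s -XGET "localhost:9200/synonym_test/_search?pretty" -d "{
    \"query\": { \"match\": { \"some_text\": \"$1\" } }
  }"
}
```

Usage: `search_some_text fruit`.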

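I have also tried the following links, but none of them helped: Synonym analyzer not working, Elasticsearch synonym analyzer not working, How to apply synonyms at query time instead of index time in Elasticsearch, how to configure the synonyms_path in elasticsearch, and many others.

So, can anyone tell me whether I am doing something wrong? Is there a problem with the settings or with the synonym file? I want the synonyms to work at query time, so that when I search for a term I get all the documents related to that term.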

1 answer:

Answer 0: (score: 0)

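Please refer to the following page, Custom Analyzer, to see how to configure a custom analyzer. If we follow the guidelines in that documentation, the index settings would look like this: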

curl -XPOST localhost:9200/synonym_test/ -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_synonyms": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "my_synonym_filter"]
        }
      },
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "ignore_case": true,
          "synonyms_path": "synonym.txt"
        }
      }
    }
  }
}'

This currently works on my Elasticsearch instance.
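
To check the analyzer on your own instance independently of any search, the `_analyze` API can show the tokens it produces for a piece of text. The sketch below assumes the index and analyzer names used in this thread; the helper name `analyze_with_synonyms` is made up for illustration.

```shell
# Hypothetical helper: show the tokens that the my_synonyms analyzer
# emits for the text passed as $1.
analyze_with_synonyms() {
  curl -s -XGET "localhost:9200/synonym_test/_analyze?pretty" -d "{
    \"analyzer\": \"my_synonyms\",
    \"text\": \"$1\"
  }"
}
```

Usage: `analyze_with_synonyms fruit`. If the synonym file was loaded as intended, the token list should include bananas alongside fruit; if only fruit comes back, the problem is in the filter or the file rather than in the query.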