Adding extra stop words in Elasticsearch

Time: 2014-01-15 08:15:31

Tags: elasticsearch stop-words

Goal

Keep stop words from appearing in the terms facet

Environment & Setup

Mac OS X, ES 0.90.7 installed via Homebrew

Steps

Update the config

# /usr/local/Cellar/elasticsearch/0.90.7/config/elasticsearch.yml
# add more Stopwords to default standard analyzer
index:
  analysis:
    analyzer:
      standard:
        type: standard
        stopwords: [http, t.co]

Restart ES, then test the analyzer

curl -XGET 'localhost:9200/_analyze?analyzer=standard&pretty' -d 'this is a test http'

The result is

{
  "tokens": [
    {
      "token": "test",
      "start_offset": 10,
      "end_offset": 14,
      "type": "<ALPHANUM>",
      "position": 4
    },
    {
      "token": "http",
      "start_offset": 15,
      "end_offset": 19,
      "type": "<ALPHANUM>",
      "position": 5
    }
  ]
}

Expected

http should not be indexed, and it should not appear among the tokens
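To make the expectation above checkable, here is a minimal Python sketch that scans an `_analyze` response for stop words that slipped through. The helper name `leaked_stopwords` is hypothetical, and the response is simply the output shown above pasted in as a string, so no running cluster is assumed.

```python
import json

# The _analyze output from above, pasted verbatim as sample input.
ANALYZE_RESPONSE = """
{
  "tokens": [
    {"token": "test", "start_offset": 10, "end_offset": 14,
     "type": "<ALPHANUM>", "position": 4},
    {"token": "http", "start_offset": 15, "end_offset": 19,
     "type": "<ALPHANUM>", "position": 5}
  ]
}
"""


def leaked_stopwords(analyze_json, stopwords):
    """Return the tokens in an _analyze response that should have been removed.

    Hypothetical helper for illustration; it only inspects the "tokens" list.
    """
    tokens = json.loads(analyze_json)["tokens"]
    unwanted = set(stopwords)
    return [t["token"] for t in tokens if t["token"] in unwanted]


leaked = leaked_stopwords(ANALYZE_RESPONSE, ["http", "t.co"])
print(leaked)  # prints ['http']
```

An empty list would mean the custom stop words are being applied. Here `http` is still reported, which matches the problem described.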

1 answer:

Answer 0: (score: 2)

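You don't need an analyzer configuration to exclude words from a terms facet. When requesting the terms facet, you can pass the list of words to exclude in the exclude parameter:

"facets" : {
    "body" : {
        "terms" : {
            "field" : "body",
            "exclude" : ["http", "t.co"]
        }
    }
}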

For details, see the terms facet documentation
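As a rough sketch of how the facet fragment fits into a complete search body, the Python below builds the request and prints it as JSON. The function name `build_terms_facet_request` and the `match_all` query are assumptions added for illustration; only the `facets` part comes from the answer.

```python
import json


def build_terms_facet_request(field, excluded_terms):
    """Build a search body with a terms facet that skips excluded_terms.

    Hypothetical helper; the match_all query is an assumption, chosen so the
    facet is computed over every document.
    """
    return {
        "query": {"match_all": {}},
        "facets": {
            field: {
                "terms": {
                    "field": field,
                    "exclude": list(excluded_terms),
                }
            }
        },
    }


body = build_terms_facet_request("body", ["http", "t.co"])
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the `-d` payload of a curl search request, in the same style as the `_analyze` call in the question. Note that exclude only hides terms from the facet output; the terms are still indexed.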