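I have Elasticsearch 1.5 running on my server.
Specifically, I want to create three fields:
1. name
2. description
3. nickname
I want to configure stop words for the description and nickname fields so that unwanted stop words are removed automatically when I insert data into Elasticsearch. I have tried many times, but it has not worked.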
curl -X POST http://127.0.0.1:9200/tryoindex/ -d'
{
"settings": {
"analysis": {
"filter": {
"custom_english_stemmer": {
"type": "stemmer",
"name": "english"
},
"snowball": {
"type" : "snowball",
"language" : "English"
}
},
"analyzer": {
"custom_lowercase_stemmed": {
"tokenizer": "standard",
"filter": [
"lowercase",
"custom_english_stemmer",
"snowball"
]
}
}
}
},
"mappings": {
"test": {
"_all" : {"enabled" : true},
"properties": {
"text": {
"type": "string",
"analyzer": "custom_lowercase_stemmed"
}
}
}
}
}'
curl -X POST "http://localhost:9200/tryoindex/nama/1" -d '{
"text" : "Tryolabs running monkeys KANGAROOS and jumping elephants jum is your"
}'
curl "http://localhost:9200/tryoindex/nama/_search?pretty=1" -d '{
"query": {
"query_string": {
"query": "Tryolabs running monkeys KANGAROOS and jumping elephants jum is your",
"fields": ["text"]
}
}
}'
Answer 0 (score: 1)
Change the analyzer section of your settings to:
"analyzer": {
"custom_lowercase_stemmed": {
"tokenizer": "standard",
"filter": [
"stop",
"lowercase",
"custom_english_stemmer",
"snowball"
]
}
}
To verify the change, run:
curl -XGET 'localhost:9200/tryoindex/_analyze?analyzer=custom_lowercase_stemmed' -d 'testing this is stopword testing'
and inspect the tokens:
{"tokens":[{"token":"test","start_offset":0,"end_offset":7,"type":"<ALPHANUM>","position":1},{"token":"stopword","start_offset":16,"end_offset":24,"type":"<ALPHANUM>","position":4},{"token":"test","start_offset":25,"end_offset":32,"type":"<ALPHANUM>","position":5}]}
PS: If you do not want stemmed tokens (for example "test" instead of "testing"), remove the stemmer filters.
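Putting this together for the three fields in the question, a minimal sketch of an index definition might look like the following. The index name `products`, the type name `product`, and the `ES_URL` variable are assumptions made for illustration and do not come from the question. The sketch also places `lowercase` before `stop`, because the built-in English stop list is lowercase and the stop filter is case-sensitive by default, and it keeps only one of the two stemmers for brevity.

```shell
# Assumed names: ES_URL, the index "products" and the type "product" are placeholders.
ES_URL="${ES_URL:-http://localhost:9200}"
INDEX="products"

# Request body: one analyzer with a stop filter, applied only to
# the description and nickname fields; name keeps the default analyzer.
BODY='{
  "settings": {
    "analysis": {
      "filter": {
        "custom_english_stemmer": { "type": "stemmer", "name": "english" }
      },
      "analyzer": {
        "custom_lowercase_stemmed": {
          "tokenizer": "standard",
          "filter": ["lowercase", "stop", "custom_english_stemmer"]
        }
      }
    }
  },
  "mappings": {
    "product": {
      "properties": {
        "name":        { "type": "string" },
        "description": { "type": "string", "analyzer": "custom_lowercase_stemmed" },
        "nickname":    { "type": "string", "analyzer": "custom_lowercase_stemmed" }
      }
    }
  }
}'

# Create the index; report a failure instead of aborting if the node is unreachable.
curl -s -X POST "$ES_URL/$INDEX/" -d "$BODY" \
  || echo "request failed (is Elasticsearch running?)" >&2
```

Documents would then need to be indexed under the same type name as the mapping (`product` here). In the question the mapping is declared for the type `test` while documents are posted to `nama`, so the custom analyzer would most likely not be applied to those documents.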
Answer 1 (score: 0)
You need to use the stop token filter in your analyzer's filter chain.
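If the default English list is not what you want, the stop token filter can also be given an explicit word list. The following is only a sketch: the index name `stopdemo`, the filter name `my_stop`, the analyzer name `my_stop_analyzer`, and the word list itself are made up for illustration, while `stopwords` and `ignore_case` are parameters of the stop token filter.

```shell
# Assumed names: ES_URL, "stopdemo", "my_stop" and "my_stop_analyzer" are placeholders.
ES_URL="${ES_URL:-http://localhost:9200}"
INDEX="stopdemo"

# A custom stop filter with its own word list; ignore_case makes
# "IS" and "This" match the lowercase entries as well.
BODY='{
  "settings": {
    "analysis": {
      "filter": {
        "my_stop": {
          "type": "stop",
          "stopwords": ["is", "your", "and", "this"],
          "ignore_case": true
        }
      },
      "analyzer": {
        "my_stop_analyzer": {
          "tokenizer": "standard",
          "filter": ["my_stop"]
        }
      }
    }
  }
}'

# Create the index, then run sample text through the analyzer to see which tokens survive.
curl -s -X POST "$ES_URL/$INDEX/" -d "$BODY" \
  || echo "index creation failed (is Elasticsearch running?)" >&2
curl -s -X GET "$ES_URL/$INDEX/_analyze?analyzer=my_stop_analyzer&pretty=1" \
  -d 'This IS your stopword test' \
  || echo "analyze request failed" >&2
```

The `_analyze` call is the same verification technique used in the other answer, so the returned token list shows directly whether the listed words were dropped.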