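I want to build an analyzer that skips the words "g" and "l" as well as every decimal number it encounters. I'm not sure whether a stop filter is the right tool for this, or how to specify that the decimal numbers should be skipped. This is what I have so far: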
PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": [ "l", "g" ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "lowercase", "my_stopwords" ]
        }
      }
    }
  }
}
How can I fix it so that it also handles decimal numbers?
Answer 0: (score: 1)
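Me again. I couldn't find a way to add a regular expression to the stop words. However, I did solve the problem by adding another filter named filter_amount. This is what it looks like:
"filter_amount": {
  "type": "pattern_replace",
  "pattern": "[\\d]+([\\.,][\\d]+)?",
  "replacement": ""
}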
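To see what that pattern does to individual tokens, here is a small Python sketch. It only mimics the regex replacement with Python's `re` module and is not Elasticsearch itself; the sample tokens and the helper function are made up for illustration.

```python
import re

# Same regex as in filter_amount. The doubled backslashes in the JSON
# settings become single backslashes in a Python raw string.
AMOUNT = re.compile(r"[\d]+([\.,][\d]+)?", re.ASCII)


def filter_amount(token: str) -> str:
    """Replace every match in the token with "" (hypothetical stand-in
    for a pattern_replace filter with an empty replacement)."""
    return AMOUNT.sub("", token)


# Illustrative tokens, assumed to arrive one at a time from the tokenizer.
samples = ["250", "1.5", "0,75", "milk", "5g", "abc123"]
results = {token: filter_amount(token) for token in samples}

for token, replaced in results.items():
    print(f"{token!r} -> {replaced!r}")
```

In this sketch a purely numeric token becomes an empty string rather than disappearing, and because the pattern is not anchored, a token such as `5g` is reduced to `g`.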
And this is what the complete settings should look like:
PUT /products
{
  "settings": {
    "analysis": {
      "filter": {
        "my_stopwords": {
          "type": "stop",
          "stopwords": [ "l", "g" ]
        },
        "filter_amount": {
          "type": "pattern_replace",
          "pattern": "[\\d]+([\\.,][\\d]+)?",
          "replacement": ""
        }
      },
      "analyzer": {
        "my_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [ "lowercase", "my_stopwords", "filter_amount" ]
        }
      }
    }
  }
}
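One thing worth checking is the order of the filters. The Python sketch here imitates the chain on a hand-written token list. It assumes the tokenizer emits `500g` as a single token, and the functions are simplified stand-ins, not Elasticsearch code. `drop_empty` is a hypothetical extra step, playing the role that a `length` filter with `"min": 1` might play in the real settings.

```python
import re

STOPWORDS = {"l", "g"}
AMOUNT = re.compile(r"[\d]+([\.,][\d]+)?", re.ASCII)


def lowercase(tokens):
    return [t.lower() for t in tokens]


def my_stopwords(tokens):
    # Drop tokens that are exactly one of the stop words.
    return [t for t in tokens if t not in STOPWORDS]


def filter_amount(tokens):
    # Strip numeric parts; tokens are kept even if they end up empty.
    return [AMOUNT.sub("", t) for t in tokens]


def drop_empty(tokens):
    # Hypothetical extra step: discard tokens of length zero.
    return [t for t in tokens if t]


def run(tokens, chain):
    """Apply each filter in order, like an analyzer's filter list."""
    for step in chain:
        tokens = step(tokens)
    return tokens


# Assumed tokenizer output for a text like "Milk 1.5 L Flour 500g".
tokens = ["Milk", "1.5", "L", "Flour", "500g"]

as_posted = run(tokens, [lowercase, my_stopwords, filter_amount])
reordered = run(tokens, [lowercase, filter_amount, my_stopwords, drop_empty])

print(as_posted)  # ['milk', '', 'flour', 'g']
print(reordered)  # ['milk', 'flour']
```

In this simulation the posted order leaves an empty token and a stray `g`, so it may be worth placing `filter_amount` before `my_stopwords` and confirming the actual tokens with the `_analyze` API.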
Everything else stays the same. Cheers!