I am following the post "Creating an index Nest" and trying to update my index settings. Everything runs fine, but the html_strip filter is not stripping the HTML. My code is:
var node = new Uri(_url + ":" + _port);
var settings = new ConnectionSettings(node);
settings.SetDefaultIndex(index);
_client = new ElasticClient(settings);
//to apply filters during indexing use folding to remove diacritics and html strip to remove html
_client.UpdateSettings(
    f => f.Analysis(descriptor => descriptor
        .Analyzers(
            bases => bases
                .Add("folded_word", new CustomAnalyzer
                {
                    Filter = new List<string> { "icu_folding", "trim" },
                    Tokenizer = "standard"
                })
        )
        .CharFilters(
            cf => cf.Add("html_strip", new HtmlStripCharFilter())
        )
    )
);
Answer 0 (score: 2)
You are getting an error:
Can't update non dynamic settings[[index.analysis.analyzer.folded_word.filter.0, index.analysis.char_filter.html_strip.type, index.analysis.analyzer.folded_word.filter.1, index.analysis.analyzer.folded_word.type, index.analysis.analyzer.folded_word.tokenizer]] for open indices[[my_index]]
Before you try to update the settings, close the index first, update the settings, and reopen the index afterwards. Have a look.
client.CloseIndex(..);
client.UpdateSettings(..);
client.OpenIndex(..);
UPDATE
Add the html_strip char filter to your custom analyzer:
.Analysis(descriptor => descriptor
    .Analyzers(bases => bases.Add("folded_word",
        new CustomAnalyzer
        {
            Filter = new List<string> { "icu_folding", "trim" },
            Tokenizer = "standard",
            CharFilter = new List<string> { "html_strip" }
        }))
)
Now you can run a test to check whether this analyzer returns the correct tokens:
client.Analyze(a => a.Index(indexName).Text("this <a> is a test <div>").Analyzer("folded_word"));
The returned tokens should no longer include the HTML tags.
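To see why the analyzer has to list the char filter itself, here is a rough, self-contained toy model of an analysis chain (char filters first, then the tokenizer). It is not Elasticsearch or NEST code: ToyAnalyzer, StripTags and Tokenize are hypothetical names, and the regexes are only crude stand-ins for html_strip and the standard tokenizer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Toy model only: char filters run on the raw text, then the tokenizer runs.
// None of these names exist in Elasticsearch or NEST; they are illustrative.
public static class ToyAnalyzer
{
    // Crude stand-in for html_strip: replace anything tag-shaped with a space.
    public static string StripTags(string text)
    {
        return Regex.Replace(text, "<[^>]*>", " ");
    }

    // Crude stand-in for the standard tokenizer: split on non-alphanumerics.
    public static List<string> Tokenize(string text)
    {
        return Regex.Split(text, @"[^\p{L}\p{N}]+")
                    .Where(t => t.Length > 0)
                    .ToList();
    }

    // Apply only the char filters this "analyzer" was given, then tokenize.
    public static List<string> Analyze(string text, IEnumerable<Func<string, string>> charFilters)
    {
        foreach (var filter in charFilters)
        {
            text = filter(text);
        }
        return Tokenize(text);
    }
}

public static class Program
{
    public static void Main()
    {
        const string input = "this <a> is a test <div>";

        // Analyzer with no char filters: tag names survive as tokens.
        var unstripped = ToyAnalyzer.Analyze(input, new Func<string, string>[0]);
        Console.WriteLine(string.Join(",", unstripped)); // this,a,is,a,test,div

        // Analyzer that lists the char filter: tags are gone before tokenizing.
        var stripped = ToyAnalyzer.Analyze(input, new Func<string, string>[] { ToyAnalyzer.StripTags });
        Console.WriteLine(string.Join(",", stripped)); // this,is,a,test
    }
}
```

In this toy, a char filter that exists but is not passed to Analyze has no effect, which mirrors the need to list "html_strip" in the analyzer's CharFilter rather than only registering it under CharFilters.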
Hope it helps.