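I am trying to apply the html_strip char filter and the lowercase filter to a field analyzed with the keyword tokenizer. When searching, I noticed that the results are not what I expected.
This is the index we are trying to create: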
PUT /test_index
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 0,
"analysis": {
"analyzer": {
"ExportPrimaryAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": "lowercase",
"char_filter": "html_strip"
},
"ExportRawAnalyzer": {
"type": "custom",
"buffer_size": "1000",
"tokenizer": "keyword",
"filter": "lowercase",
"char_filter": "html_strip"
}
}
}
},
"mappings": {
"test_type": {
"properties": {
"city": {
"type": "string",
"analyzer" : "ExportPrimaryAnalyzer"
},
"city_raw":{
"type": "string",
"analyzer" : "ExportRawAnalyzer"
}
}
}
}
}
Here is a sample document:
PUT test_index/test_type/4
{
"city": "<p>I am from Pune</p>",
"city_raw": "<p>I am from Pune</p>"
}
When we run a wildcard query, we get no results. This is the query we are running:
{
"query": {
"wildcard": {
"city_raw": "i am*"
}
}
}
Any help is appreciated.
Answer 0 (score: 0)
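The html_strip char filter replaces HTML block elements such as <p> with newlines. With the keyword tokenizer the whole field value is indexed as a single term, so the term for city_raw starts with a newline, and the wildcard pattern (which is not analyzed) cannot match it from the beginning.
So if you use the keyword tokenizer, you need an additional char filter that replaces the newlines with an empty string.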
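To see why the original query misses, here is a rough Python sketch of the analysis chain. It is only a simulation, not Elasticsearch itself: the regex handles just <p> tags as a stand-in for html_strip, fnmatchcase stands in for the wildcard query, and the helper names html_strip and keyword_lowercase are made up for this illustration.

```python
import re
from fnmatch import fnmatchcase

# Simplified stand-in for the html_strip char filter (assumption: only <p>
# tags are handled here); block-level tags are replaced with a newline.
BLOCK_TAG = re.compile(r"</?p\s*>", re.IGNORECASE)


def html_strip(text):
    return BLOCK_TAG.sub("\n", text)


def keyword_lowercase(text, strip_newlines=False):
    """Return the single term that a keyword tokenizer plus lowercase
    filter would produce after the char filters have run."""
    stripped = html_strip(text)
    if strip_newlines:
        # Mimics a mapping char filter that maps "\n" to the empty string.
        stripped = stripped.replace("\n", "")
    return stripped.lower()


doc = "<p>I am from Pune</p>"
term_without_fix = keyword_lowercase(doc)
term_with_fix = keyword_lowercase(doc, strip_newlines=True)

print(repr(term_without_fix))  # '\ni am from pune\n'
print(repr(term_with_fix))     # 'i am from pune'

# The pattern must match the whole term from its first character.
print(fnmatchcase(term_without_fix, "i am*"))  # False
print(fnmatchcase(term_with_fix, "i am*"))     # True
```

The leading newline is the only difference between the two terms, and it is enough to make a pattern anchored at "i am" fail.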
Example:
PUT test
{
"settings": {
"number_of_shards": 5,
"number_of_replicas": 0,
"analysis": {
"char_filter": {
"remove_new_line": {
"type": "mapping",
"mappings": [
"\\n =>"
]
}
},
"analyzer": {
"ExportPrimaryAnalyzer": {
"type": "custom",
"tokenizer": "whitespace",
"filter": [
"lowercase"
],
"char_filter": [
"html_strip"
]
},
"ExportRawAnalyzer": {
"type": "custom",
"buffer_size": "1000",
"tokenizer": "keyword",
"filter": [
"lowercase"
],
"char_filter": [
"html_strip",
"remove_new_line"
]
}
}
}
},
"mappings": {
"test_type": {
"properties": {
"city": {
"type": "string",
"analyzer": "ExportPrimaryAnalyzer"
},
"city_raw": {
"type": "string",
"analyzer": "ExportRawAnalyzer"
}
}
}
}
}
PUT test/test_type/4
{
"city": "<p>I am from Bangalore I like Pune too</p>",
"city_raw": "<p>I am from Bangalore I like Pune too</p>"
}
POST test/_search
{
"query": {
"wildcard": {
"city_raw": "i am *"
}
}
}
Result:
"hits": [
{
"_index": "test",
"_type": "test_type",
"_id": "4",
"_score": 1,
"_source": {
"city": "<p>I am from Bangalore I like Pune too</p>",
"city_raw": "<p>I am from Bangalore I like Pune too</p>"
}
}
]