我有一个定义了以下映射的字段:
"my_field": {
"properties": {
"address": {
"type": "string",
"analyzer": "email",
"search_analyzer": "whitespace"
}
}
}
我的电子邮件分析器如下所示:
{
"analysis": {
"filter": {
"email_filter": {
"type": "edge_ngram",
"min_gram": "3",
"max_gram": "255"
}
},
"analyzer": {
"email": {
"type": "custom",
"filter": [
"lowercase",
"email_filter",
"unique"
],
"tokenizer": "uax_url_email"
}
}
}
}
当我尝试搜索电子邮件ID时,例如test.xyz@example.com
搜索tes,test.xy等术语并不起作用。但是,如果我搜索 test.xyz或test.xyz@example.com,它工作正常。我尝试使用我的电子邮件过滤器分析令牌,它按预期正常工作
实施例。点击http://localhost:9200/my_index/_analyze?analyzer=email&text=test.xyz@example.com
我明白了:
{
"tokens": [{
"token": "tes",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.x",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xy",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@e",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@ex",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@exa",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@exam",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@examp",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@exampl",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@example",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@example.",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@example.c",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@example.co",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}, {
"token": "test.xyz@example.com",
"start_offset": 0,
"end_offset": 20,
"type": "word",
"position": 0
}]
}
所以我知道令牌化是有效的。但是在搜索时,它无法搜索部分字符串。
对于前。正在寻找http://localhost:9200/my_index/my_field/_search?q=test,结果显示没有点击。
我的索引详情:
{
"my_index": {
"aliases": {
"alias_default": {}
},
"mappings": {
"my_field": {
"properties": {
"address": {
"type": "string",
"analyzer": "email",
"search_analyzer": "whitespace"
},
"boost": {
"type": "long"
},
"createdat": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
},
"instanceid": {
"type": "long"
},
"isdeleted": {
"type": "integer"
},
"object": {
"type": "string"
},
"objecthash": {
"type": "string"
},
"objectid": {
"type": "string"
},
"parent": {
"type": "short"
},
"parentid": {
"type": "integer"
},
"updatedat": {
"type": "date",
"format": "strict_date_optional_time||epoch_millis"
}
}
}
},
"settings": {
"index": {
"creation_date": "1480342980403",
"number_of_replicas": "1",
"max_result_window": "100000",
"uuid": "OUuiTma8CA2VNtw9Og",
"analysis": {
"filter": {
"email_filter": {
"type": "edge_ngram",
"min_gram": "3",
"max_gram": "255"
},
"autocomplete_filter": {
"type": "edge_ngram",
"min_gram": "3",
"max_gram": "20"
}
},
"analyzer": {
"autocomplete": {
"type": "custom",
"filter": [
"lowercase",
"autocomplete_filter"
],
"tokenizer": "standard"
},
"email": {
"type": "custom",
"filter": [
"lowercase",
"email_filter",
"unique"
],
"tokenizer": "uax_url_email"
}
}
},
"number_of_shards": "5",
"version": {
"created": "2010099"
}
}
},
"warmers": {}
}
}
答案 0 :(得分:1)
好的,除了你的查询,一切看起来都是正确的。
您只需在查询中指定address
字段,就可以了:
http://localhost:9200/my_index/my_field/_search?q=address:test
如果您未指定address
字段,则查询将在默认情况下搜索分析器为_all
的{{1}}字段上运行,因此您找不到原因任何东西。