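I have an email analyzer in Elasticsearch 1.7, and I want each whole string to be treated as a single email address rather than being split in any way. However, email input is being split at the @ character.
Here is my template: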
{
  "template": "someindex*",
  "settings": {
    "index.analysis.filter.length-filter.min": "8",
    "index.analysis.analyzer.default.stopwords": "_none_",
    "index.analysis.filter.length-filter.type": "length",
    "index.analysis.filter.length-filter.max": "4999",
    "index.mapper.dynamic": "true",
    "index.analysis.analyzer.default.type": "standard",
    "index.analysis.analyzer.email-analyzer.filter": ["lowercase", "unique"],
    "index.analysis.analyzer.email-analyzer.type": "custom",
    "index.analysis.tokenizer.email-tokenizer.type": "uax_url_email",
    "index.analysis.analyzer.email-analyzer.tokenizer": "email-tokenizer"
  },
  "mappings": {
    "_default_": {
      "properties": {
        "email": {
          "index_analyzer": "email-analyzer",
          "search_analyzer": "email-analyzer",
          "type": "string",
          "fields": {
            "raw": {
              "index": "not_analyzed",
              "ignore_above": 256,
              "type": "string"
            }
          }
        }
      },
      "_all": {
        "enabled": true,
        "omit_norms": true
      }
    }
  },
  "aliases": {
    "someindex": {}
  }
}
When I run this, I get the following output:
$ curl -XGET 'http://localhost:9200/someindex/_analyze?analyzer=email-analyzer' -d 'test.me@gmail.com'
{"tokens":[{"token":"test.me","start_offset":0,"end_offset":7,"type":"<ALPHANUM>","position":1},{"token":"gmail.com","start_offset":8,"end_offset":17,"type":"<ALPHANUM>","position":2}]}
Even though I have defined the uax_url_email tokenizer for that specific analyzer, the email address is still being split.
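To make the mismatch concrete, here is a minimal Python sketch that parses the `_analyze` response quoted above and compares it with the single token one would expect if the address were kept whole. The helper name `tokens_from_response` and the `expected_tokens` value are illustrative assumptions, not part of the original setup.

```python
import json

# Response copied verbatim from the _analyze call in this post.
RESPONSE = (
    '{"tokens":[{"token":"test.me","start_offset":0,"end_offset":7,'
    '"type":"<ALPHANUM>","position":1},{"token":"gmail.com","start_offset":8,'
    '"end_offset":17,"type":"<ALPHANUM>","position":2}]}'
)


def tokens_from_response(raw):
    """Return (token, type) pairs from an _analyze JSON response (hypothetical helper)."""
    return [(t["token"], t["type"]) for t in json.loads(raw)["tokens"]]


observed = tokens_from_response(RESPONSE)

# Assumption: an analyzer that keeps emails intact would emit one token.
expected_tokens = ["test.me@gmail.com"]

# True when the analyzer output differs from the single expected token.
was_split = [token for token, _ in observed] != expected_tokens

print(observed)
print("split:", was_split)
```

Both observed tokens carry the type `<ALPHANUM>`. As far as I know, the uax_url_email tokenizer normally labels a recognized address as `<EMAIL>`, so the type field may be a useful hint when checking which tokenizer actually ran.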
What am I doing wrong here?
Thanks for your help!