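I am following the official Elasticsearch guide.
I created the following index with a title field and a title.shingles sub-field (the sub-field uses a custom analyzer that produces two-word shingles):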
PUT /my_index
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "filter": {
        "my_shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 2,
          "output_unigrams": false
        }
      },
      "analyzer": {
        "my_shingle_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "my_shingle_filter"
          ]
        }
      }
    }
  }
}
PUT /my_index/_mapping/my_type
{
  "my_type": {
    "properties": {
      "title": {
        "type": "text",
        "fields": {
          "shingles": {
            "type": "text",
            "analyzer": "my_shingle_analyzer",
            "search_analyzer": "my_shingle_analyzer"
          }
        }
      }
    }
  }
}
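To make clear what I expect title.shingles to contain, here is a rough Python sketch of the analyzer chain. It is only an approximation and not Elasticsearch itself: splitting on whitespace stands in for the standard tokenizer (an assumption that holds for simple sentences like mine), and shingle_tokens is a hypothetical helper name.

```python
def shingle_tokens(text, size=2):
    """Approximate my_shingle_analyzer: lowercase the words, then emit
    every run of `size` adjacent words (no unigrams)."""
    # Whitespace split is a stand-in for the standard tokenizer (assumption).
    words = text.lower().split()
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


print(shingle_tokens("Sue ate the alligator"))
# prints ['sue ate', 'ate the', 'the alligator']
```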
Then I indexed some data:
POST /my_index/my_type/_bulk
{ "index": { "_id": 1 }}
{ "title": "Sue ate the alligator" }
{ "index": { "_id": 2 }}
{ "title": "The alligator ate Sue" }
{ "index": { "_id": 3 }}
{ "title": "Sue never goes anywhere without her alligator skin purse" }
Now I want to query the title.shingles field. This query returns results:
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "title.shingles": "ate sue"
    }
  }
}
This query returns no results (an empty result set):
GET /my_index/my_type/_search
{
  "query": {
    "match": {
      "title.shingles": "the hungry alligator ate sue"
    }
  }
}
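It looks as if my_shingle_analyzer is not applied at search time. My assumption is that it should be: the longer query text should be split into shingles, and at least one of those tokens (for example "ate sue") should be found...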
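To spell that assumption out, this small Python sketch compares the two-word shingles of the long query with those of each indexed title. It does not model how Elasticsearch actually builds or executes the match query; it only shows which shingles the query and the documents have in common. The shingles helper and the whitespace split are my own simplifications.

```python
def shingles(text):
    """Set of lowercase two-word shingles (whitespace split is an approximation)."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


docs = {
    1: "Sue ate the alligator",
    2: "The alligator ate Sue",
    3: "Sue never goes anywhere without her alligator skin purse",
}
query = "the hungry alligator ate sue"

# Shingles shared between the query and each document title.
overlap = {doc_id: sorted(shingles(title) & shingles(query))
           for doc_id, title in docs.items()}
print(overlap)
# prints {1: [], 2: ['alligator ate', 'ate sue'], 3: []}
```

So if any single shared shingle were enough for a match, I would expect document 2 to come back for the long query.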
I have verified that the analyzer itself works by using the ES _analyze command:
GET /my_index/_analyze?analyzer=my_shingle_analyzer
{
  "text": "Sue ate the alligator"
}