I'm trying to work out why this simple example doesn't work. I reduced my regular expressions to a trivial one because they weren't working at all.
{
  "settings" : {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index": {
      "analysis": {
        "char_filter" : {
          "my_pattern" :{
            "type": "pattern_replace",
            "pattern": "a",
            "replacement": "u"
          }
        },
        "analyser": {
          "my_analyser": {
            "type": "custom",
            "tokenizer": "whitespace",
            "char_filter": ["my_pattern"]
          }
        }
      }
    }
  },
  "mappings" : {
    "my_type" : {
      "_source": {
        "enabled": true
      }
    }
  },
  "properties": {
    "test": {
      "type": "string",
      "store": true,
      "index": "analysed",
      "analyser": "my_analyser",
      "index_options": "positions"
    }
  }
}'
Thanks for your help.
I indexed a single word: "hang"
$ curl -XGET 'http://localhost:9200/tm_de_fr/my_type/_search?q=hang&pretty=true'
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.30685282,
    "hits" : [ {
      "_index" : "tm_de_fr",
      "_type" : "my_type",
      "_id" : "-DWWF4kPR7S2YwZeyIsdVQ",
      "_score" : 0.30685282,
      "_source":{ "test": "hang" }
    } ]
  }
}
and
$ curl -XGET 'http://localhost:9200/tm_de_fr/my_type/_search?q=hung&pretty=true'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}
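I'm not sure whether _source is supposed to change as well, but neither the indexed data nor _source has changed. I expected "hang" to become "hung".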
$ curl -XGET 'http://localhost:9200/tm_de_fr/my_type/-DWWF4kPR7S2YwZeyIsdVQ?pretty=true'
{
  "_index" : "tm_de_fr",
  "_type" : "my_type",
  "_id" : "-DWWF4kPR7S2YwZeyIsdVQ",
  "_version" : 1,
  "found" : true,
  "_source":{ "test": "hang" }
}
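Answer 0 (score: 7)
Your mapping is incorrect. You need to use the American spelling "analyzer" (not "analyser"), and "properties" has to sit inside "my_type":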
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0,
    "index": {
      "analysis": {
        "char_filter": {
          "my_pattern": {
            "type": "pattern_replace",
            "pattern": "a",
            "replacement": "u"
          }
        },
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "char_filter": [
              "my_pattern"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "test": {
          "type": "string",
          "analyzer": "my_analyzer",
          "index_options": "positions"
        }
      }
    }
  }
}
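The misspelling is easy to miss because, judging by the question's output, the index was created and searched without any visible error. As a rough illustration only, here is a small Python sketch that walks a request body and flags British spellings before it is sent. The helper `find_misspellings` and the `BRITISH_TO_AMERICAN` table are hypothetical names made up for this example, not part of Elasticsearch or any client library, and the table only covers the spellings seen in this question.

```python
# Hypothetical lookup table: British spellings seen in the question,
# mapped to the spellings used in the corrected mapping.
BRITISH_TO_AMERICAN = {
    "analyser": "analyzer",
    "analysed": "analyzed",
}


def find_misspellings(node, path=""):
    """Return (path, found, suggestion) tuples for suspicious keys and string values."""
    problems = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key in BRITISH_TO_AMERICAN:
                problems.append((here, key, BRITISH_TO_AMERICAN[key]))
            problems.extend(find_misspellings(value, here))
    elif isinstance(node, list):
        for i, item in enumerate(node):
            problems.extend(find_misspellings(item, f"{path}[{i}]"))
    elif isinstance(node, str) and node in BRITISH_TO_AMERICAN:
        problems.append((path, node, BRITISH_TO_AMERICAN[node]))
    return problems


# Trimmed-down copy of the body from the question (assumption: loaded as a dict).
question_body = {
    "settings": {"index": {"analysis": {"analyser": {"my_analyser": {"type": "custom"}}}}},
    "properties": {"test": {"index": "analysed", "analyser": "my_analyser"}},
}

for where, found, suggestion in find_misspellings(question_body):
    print(f"{where}: '{found}' should probably be '{suggestion}'")
```

User-chosen names such as "my_analyser" are deliberately not flagged, since any name works as long as the mapping and the settings agree on it.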
Testing it with the analyze API:
curl -XGET 'localhost:9200/test/_analyze?analyzer=my_analyzer&pretty=true' -d 'aaaa'
returns:
{
  "tokens" : [ {
    "token" : "uuuu",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "<ALPHANUM>",
    "position" : 1
  } ]
}
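To make the order of operations concrete, here is a rough Python emulation of what the analyzer above does: the char filter rewrites the raw text first, and only then is the text split into tokens. This is a sketch under assumptions, not Elasticsearch code. The function names are made up, Python's `re` module stands in for the Java regex engine that Elasticsearch uses, and `\w+` is only a crude approximation of the standard tokenizer.

```python
import re


def pattern_replace(text, pattern="a", replacement="u"):
    # Char filter stage: runs on the raw text before any tokenization.
    return re.sub(pattern, replacement, text)


def analyze(text):
    """Crude stand-in for my_analyzer: char filter, then split into word tokens."""
    filtered = pattern_replace(text)
    return re.findall(r"\w+", filtered)


document = {"test": "hang"}                 # what was sent to the index
indexed_terms = analyze(document["test"])   # what ends up as searchable terms

print(analyze("aaaa"))   # ['uuuu'], matching the token in the analyze output
print(indexed_terms)     # ['hung']
print(document)          # {'test': 'hang'}, the original is left untouched
```

The last line also speaks to the question about _source: analysis only affects the terms written to the index, while _source keeps the JSON exactly as it was submitted, so it will still show "hang" once the analyzer is working.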