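I am trying to write a custom analyzer that breaks tokens on special characters and converts them to uppercase before indexing. Searching with lowercase input should still return results.
For example, if I supply data@source, the @ should be replaced with a space. Any special character should be replaced with a space, so the result is data source.
Here is what I have tried.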
PUT sound
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "my_char_filter"
          ],
          "filter": [
            "uppercase"
          ]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "pattern_replace",
          "pattern": "(\\d+)-(?=\\d)",
          "replacement": "$1 "
        }
      }
    }
  }
}
POST sound/_analyze
{
  "analyzer": "my_analyzer",
  "text": "data-source&abc"
}
It splits the tokens correctly, for example:
{
  "tokens": [
    {
      "token": "DATA",
      "start_offset": 0,
      "end_offset": 4,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "SOURCE",
      "start_offset": 5,
      "end_offset": 11,
      "type": "<ALPHANUM>",
      "position": 1
    },
    {
      "token": "ABC",
      "start_offset": 12,
      "end_offset": 15,
      "type": "<ALPHANUM>",
      "position": 2
    }
  ]
}
But if I search here with lowercase, or even uppercase, it does not work. For example:
GET sound/_search?text="data"
GET /sound/_search
{
  "query": {
    "match": {
      "text": "data"
    }
  }
}
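If I search with queries like the ones above, I get no results.
Answer 0 (score: 0)
You just need a slightly different search syntax. A URI search takes the query string in the q parameter, and a match query has to name the field being searched: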
GET sound/_search?q=data
POST sound/_search
{
  "query": {
    "match": {
      "NAME_OF_YOUR_FIELD": "data"
    }
  }
}
NAME_OF_YOUR_FIELD must be the name of the field in which you stored the data. See the Elasticsearch documentation on the match query for details.
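
One more thing worth checking: the index definition in the question has no mappings section, and an analyzer declared under settings is only applied to fields that reference it (or when it is configured as the index default). The following is a minimal sketch, not a definitive setup. It assumes Elasticsearch 7 or later (typeless mappings), a hypothetical index name sound_v2, and a hypothetical field name title. The char filter pattern is my own illustration of "replace any special character with a space" and is not the pattern from the question.

```console
# Assumption: Elasticsearch 7+; sound_v2 and title are hypothetical names.
PUT sound_v2
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": ["my_char_filter"],
          "filter": ["uppercase"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "pattern_replace",
          "pattern": "[^\\p{L}\\p{N}\\s]+",
          "replacement": " "
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "my_analyzer"
      }
    }
  }
}

# Index one document and make it searchable immediately.
PUT sound_v2/_doc/1?refresh=true
{
  "title": "data@source"
}

# The match query runs the query text through the analyzer of the title field.
GET sound_v2/_search
{
  "query": {
    "match": {
      "title": "data"
    }
  }
}
```

Because a match query analyzes the query text with the analyzer of the target field, data, Data and DATA should all be reduced to the token DATA and therefore match the indexed document. The analyzer of an existing field cannot be changed in place, which is why the sketch creates a new index instead of updating sound.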