I'm trying to use Elasticsearch to index some data about research papers, but I'm struggling with accents. For instance, if I use:
GET /_analyze?tokenizer=standard&filter=asciifolding&text="Boletínes de investigaciónes"
I get:
{
"tokens": [
{
"token": "Bolet",
"start_offset": 1,
"end_offset": 6,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "nes",
"start_offset": 7,
"end_offset": 10,
"type": "<ALPHANUM>",
"position": 2
},
{
"token": "de",
"start_offset": 11,
"end_offset": 13,
"type": "<ALPHANUM>",
"position": 3
},
{
"token": "investigaci",
"start_offset": 14,
"end_offset": 25,
"type": "<ALPHANUM>",
"position": 4
},
{
"token": "nes",
"start_offset": 26,
"end_offset": 29,
"type": "<ALPHANUM>",
"position": 5
}
]
}
I should get something like this:
{
"tokens": [
{
"token": "Boletines",
"start_offset": 1,
"end_offset": 6,
"type": "<ALPHANUM>",
"position": 1
},
{
"token": "de",
"start_offset": 11,
"end_offset": 13,
"type": "<ALPHANUM>",
"position": 3
},
{
"token": "investigacion",
"start_offset": 14,
"end_offset": 25,
"type": "<ALPHANUM>",
"position": 4
}
]
}
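To make the result I'm after concrete, here is a rough Python sketch of the folding I have in mind. It only approximates accent folding with the standard library's `unicodedata` module and a whitespace split; it is not Elasticsearch's `asciifolding` filter or its standard tokenizer, and the helper names `fold_to_ascii` and `fold_tokens` are made up for illustration.

```python
import unicodedata


def fold_to_ascii(text):
    """Strip diacritics by decomposing characters and dropping combining marks.

    This is an approximation of accent folding, not Elasticsearch's own filter.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_tokens(text):
    """Split on whitespace (a stand-in for a real tokenizer) and fold each token."""
    return [fold_to_ascii(token) for token in text.split()]


print(fold_tokens("Boletínes de investigaciónes"))
```

The point is that each accented word stays a single token with the accent removed, instead of being split at the accented letter. No stemming is applied here, so the last token keeps its full ending.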
What should I do?