I have a JSON structure like this:
{"DocumentName":"es","DocumentId":"2","Content": [{"PageNo":1,"Text": "The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."},{"PageNo":2,"Text": "The query string is processed using the same analyzer that was applied to the field during indexing."}]}
I need analyzed results for the Content.Text field. To get them, I defined the following mapping when creating the index:
curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d"{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "standard",
"filter": ["lowercase", "my_stemmer"]
}
},
"filter": {
"my_stemmer": {
"type": "stemmer",
"name": "english"
}
}
}
}
}, {
"mappings": {
"properties": {
"DocumentName": {
"type": "text"
},
"DocumentId": {
"type": "keyword"
},
"Content": {
"properties": {
"PageNo": {
"type": "integer"
},
"Text": "_all": {
"type": "text",
"analyzer": "my_analyzer",
"search_analyzer": "my_analyzer"
}
}
}
}
}
}
}"
I checked the analyzer I created:
curl -X GET "localhost:9200/myindex/_analyze?pretty" -H "Content-Type: application/json" -d"{\"analyzer\":\"my_analyzer\",\"text\":\"indexing\"}"
It returned this result:
{
"tokens" : [
{
"token" : "index",
"start_offset" : 0,
"end_offset" : 8,
"type" : "<ALPHANUM>",
"position" : 0
}
]
}
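The same analyzer check can be scripted from Python. This is a minimal sketch, assuming the 7.x `elasticsearch` client, whose `indices.analyze` method wraps the `_analyze` endpoint. The helper names `token_strings` and `analyze_text` are hypothetical, and the sample response is the one printed by the curl call above.

```python
def token_strings(analyze_response):
    """Return just the token text from an _analyze response body."""
    return [t["token"] for t in analyze_response.get("tokens", [])]


def analyze_text(es, index, analyzer, text):
    """Run `text` through `analyzer` on `index` and return the tokens.

    `es` is assumed to be an elasticsearch.Elasticsearch client (7.x).
    """
    response = es.indices.analyze(
        index=index, body={"analyzer": analyzer, "text": text}
    )
    return token_strings(response)


# The response shown above, reused so the helper can be tried
# without a running cluster.
sample_response = {
    "tokens": [
        {
            "token": "index",
            "start_offset": 0,
            "end_offset": 8,
            "type": "<ALPHANUM>",
            "position": 0,
        }
    ]
}

print(token_strings(sample_response))  # prints ['index']
```

With a live cluster, `analyze_text(es, "myindex", "my_analyzer", "indexing")` would issue the same request as the curl command.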
But after uploading the JSON to the index, when I search for "index", it returns 0 results.
import requests
from elasticsearch import Elasticsearch
res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
Any help would be greatly appreciated. Thanks.
Answer 0 (score: 1)
Ignore my comment. Stemming is working. Try the following:
Mapping:
curl -X DELETE "localhost:9200/myindex"
curl -X PUT "localhost:9200/myindex?pretty" -H "Content-Type: application/json" -d'
{
"settings":{
"analysis":{
"analyzer":{
"english_exact":{
"tokenizer":"standard",
"filter":[
"lowercase"
]
}
}
}
},
"mappings":{
"properties":{
"DocumentName":{
"type":"text"
},
"DocumentId":{
"type":"keyword"
},
"Content":{
"properties":{
"PageNo":{
"type":"integer"
},
"Text":{
"type":"text",
"analyzer":"english",
"fields":{
"exact":{
"type":"text",
"analyzer":"english_exact"
}
}
}
}
}
}
}
}'
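For anyone driving this from Python rather than curl, the mapping above can be written as a plain dict. This is a sketch, assuming the 7.x `elasticsearch` client (`indices.delete` and `indices.create` with a `body` argument). `INDEX_BODY`, `field_mapping` and `recreate_index` are hypothetical names introduced here for illustration.

```python
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "english_exact": {
                    "tokenizer": "standard",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "DocumentName": {"type": "text"},
            "DocumentId": {"type": "keyword"},
            "Content": {
                "properties": {
                    "PageNo": {"type": "integer"},
                    "Text": {
                        "type": "text",
                        "analyzer": "english",
                        "fields": {
                            "exact": {"type": "text", "analyzer": "english_exact"}
                        },
                    },
                }
            },
        }
    },
}


def field_mapping(body, dotted_path):
    """Walk the mapping along a dotted field path such as 'Content.Text'."""
    node = body["mappings"]
    for part in dotted_path.split("."):
        node = node["properties"][part]
    return node


def recreate_index(es, name, body=INDEX_BODY):
    """Drop the index if present, then create it with `body`.

    `es` is assumed to be an elasticsearch.Elasticsearch client (7.x).
    """
    es.indices.delete(index=name, ignore_unavailable=True)
    es.indices.create(index=name, body=body)


# Sanity check before sending anything to the cluster.
print(field_mapping(INDEX_BODY, "Content.Text")["analyzer"])  # prints english
```

Building the body as a dict means a structural mistake surfaces as a Python error before the request is sent, and `field_mapping` gives a quick way to confirm which analyzer a nested field ends up with.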
Data:
curl -XPOST "localhost:9200/myindex/_doc/1" -H "Content-Type: application/json" -d'
{
"DocumentName":"es",
"DocumentId":"2",
"Content":[
{
"PageNo":1,
"Text":"The full text queries enable you to search analyzed text fields such as the body of an email. The query string is processed using the same analyzer that was applied to the field during indexing."
},
{
"PageNo":2,
"Text":"The query string is processed using the same analyzer that was applied to the field during indexing."
}
]
}'
Query:
curl -XGET 'localhost:9200/myindex/_search?pretty' -H "Content-Type: application/json" -d '
{
"query":{
"simple_query_string":{
"fields":[
"Content.Text"
],
"query":"index"
}
}
}'
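Exactly one document is returned. I also tested the following stems, and they all work correctly with the suggested mapping: apply (applied), text (text), use (using).
Python example: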
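Checking several stems one by one can be automated. This is a rough sketch, assuming the 7.x `elasticsearch` client and its search response layout, where `hits.total` is an object with a `value` key. `stem_query`, `hit_count` and `count_hits_per_term` are hypothetical helper names.

```python
def stem_query(term, field="Content.Text"):
    """Build the simple_query_string body from the query above for one term."""
    return {"query": {"simple_query_string": {"fields": [field], "query": term}}}


def hit_count(search_response):
    """Read the total hit count from a search response."""
    total = search_response["hits"]["total"]
    # 7.x returns {"value": n, "relation": "eq"}; older versions return a bare int.
    return total["value"] if isinstance(total, dict) else total


def count_hits_per_term(es, index, terms):
    """Return {term: hit count}.

    `es` is assumed to be an elasticsearch.Elasticsearch client (7.x).
    """
    return {t: hit_count(es.search(index=index, body=stem_query(t))) for t in terms}
```

Against the single document indexed above, `count_hits_per_term(es, "myindex", ["index", "apply", "text", "use"])` should report one hit per term if stemming behaves as described.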
import requests
from elasticsearch import Elasticsearch
res = requests.get('http://localhost:9200')
es = Elasticsearch([{'host': 'localhost', 'port': '9200'}])
res = es.search(index='myindex', body={"query": {"match": {"Content.Text": "index"}}})
print(res)
Tested on Elasticsearch 7.4.