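We use two types of documents in Elasticsearch (ES): items and slots, where an item is the parent of its slot documents. We define the index with the following command: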
curl -XPOST 'localhost:9200/items' -d @itemsdef.json
where itemsdef.json has the following definition:
{
  "mappings" : {
    "item" : {
      "properties" : {
        "id" : {"type" : "long" },
        "name" : {
          "type" : "string",
          "_analyzer" : "textIndexAnalyzer"
        },
        "location" : {"type" : "geo_point" }
      }
    }
  },
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "activityIndexAnalyzer" : {
          "alias" : ["activityQueryAnalyzer"],
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["trim", "lowercase", "asciifolding", "spanish_stop", "spanish_synonym"]
        },
        "textIndexAnalyzer" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["word_delimiter_impl", "trim", "lowercase", "asciifolding", "spanish_stop", "spanish_synonym"]
        },
        "textQueryAnalyzer" : {
          "type" : "custom",
          "tokenizer" : "whitespace",
          "filter" : ["trim", "lowercase", "asciifolding", "spanish_stop"]
        }
      },
      "filter" : {
        "spanish_stop" : {
          "type" : "stop",
          "ignore_case" : true,
          "enable_position_increments" : true,
          "stopwords_path" : "analysis/spanish-stopwords.txt"
        },
        "spanish_synonym" : {
          "type" : "synonym",
          "synonyms_path" : "analysis/spanish-synonyms.txt"
        },
        "word_delimiter_impl" : {
          "type" : "word_delimiter",
          "generate_word_parts" : true,
          "generate_number_parts" : true,
          "catenate_words" : true,
          "catenate_numbers" : true,
          "split_on_case_change" : false
        }
      }
    }
  }
}
Then we add the child document definition with the following command:
curl -XPOST 'localhost:9200/items/slot/_mapping' -d @slotsdef.json
where slotsdef.json has the following definition:
{
  "slot" : {
    "_parent" : {"type" : "item"},
    "_routing" : {
      "required" : true,
      "path" : "parent_id"
    },
    "properties": {
      "id" : { "type" : "long" },
      "parent_id" : { "type" : "long" },
      "activity" : {
        "type" : "string",
        "_analyzer" : "activityIndexAnalyzer"
      },
      "day" : { "type" : "integer" },
      "start" : { "type" : "integer" },
      "end" : { "type" : "integer" }
    }
  }
}
Finally, we run a bulk index with the following command:
curl -XPOST 'localhost:9200/items/_bulk' --data-binary @testbulk.json
where testbulk.json holds the following data:
{"index":{"_type": "item", "_id":35}}
{"location":[40.4,-3.6],"id":35,"name":"A Name"}
{"index":{"_type":"slot","_id":126,"_parent":35}}
{"id":126,"start":1330,"day":1,"end":1730,"activity":"An Activity","parent_id":35}
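For more than a handful of documents, a file in this format could be generated with a short script. The following is only a sketch in Python: the function name `bulk_lines` and the in-memory record layout are assumptions made for illustration, and it simply mirrors the action/source line pairs shown above, passing `_parent` for each slot.

```python
import json


def bulk_lines(item, slots):
    """Return bulk-API lines (action line, then source line) for one item and its slots."""
    lines = [
        json.dumps({"index": {"_type": "item", "_id": item["id"]}}),
        json.dumps(item),
    ]
    for slot in slots:
        # Each slot names its parent item so it is stored alongside that item.
        action = {"index": {"_type": "slot", "_id": slot["id"], "_parent": slot["parent_id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(slot))
    return lines


if __name__ == "__main__":
    item = {"location": [40.4, -3.6], "id": 35, "name": "A Name"}
    slots = [
        {"id": 126, "start": 1330, "day": 1, "end": 1730,
         "activity": "An Activity", "parent_id": 35},
    ]
    # The bulk API expects one JSON object per line, ending with a newline;
    # print() supplies the final newline here.
    print("\n".join(bulk_lines(item, slots)))
```

Redirecting the output to testbulk.json gives a file that can be sent with the same curl command as above.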
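I am trying to run the following query: find all items within a given distance of a location that have children (slots) on a specified day and within a certain start and end range.
Items with more matching slots should score higher.
I tried starting from the existing samples, but the documentation is very sparse and it is hard to make progress.
Any clues?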
Answer 0 (score: 0)
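I don't think there is a way to write an efficient query that does this without moving the location to the slots. You could do something like the following, but it may be quite inefficient for some data: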
{
  "query": {
    "top_children" : {
      "type": "slot",
      "query" : {
        "constant_score" : {
          "query" : {
            ... your query for children goes here ...
          }
        }
      },
      "score" : "sum",
      "factor" : 5,
      "incremental_factor" : 2
    }
  },
  "filter": {
    "geo_distance" : {
      "distance" : "200km",
      "location" : {
        "lat" : 40,
        "lon" : -70
      }
    }
  }
}
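To make the placeholder concrete, here is a rough sketch in Python that builds the same request body with a child query filled in. It assumes the child type is `slot` (as in the question's mapping) and that "within a start and end range" means the slot starts no earlier than one value and ends no later than another; the function and parameter names are made up for illustration.

```python
import json


def build_search_body(day, earliest_start, latest_end, lat, lon, distance="200km"):
    """Build a top_children request with the slot criteria filled in."""
    # Assumed slot criteria: on the given day, starting at or after
    # earliest_start and ending at or before latest_end.
    slot_query = {
        "bool": {
            "must": [
                {"term": {"day": day}},
                {"range": {"start": {"gte": earliest_start}}},
                {"range": {"end": {"lte": latest_end}}},
            ]
        }
    }
    return {
        "query": {
            "top_children": {
                "type": "slot",
                # constant_score gives every matching slot the same score.
                "query": {"constant_score": {"query": slot_query}},
                "score": "sum",
                "factor": 5,
                "incremental_factor": 2,
            }
        },
        # Top-level filter on the parent items' location.
        "filter": {
            "geo_distance": {
                "distance": distance,
                "location": {"lat": lat, "lon": lon},
            }
        },
    }


if __name__ == "__main__":
    body = build_search_body(day=1, earliest_start=1200, latest_end=1800,
                             lat=40.4, lon=-3.6)
    print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and sent to the index's `_search` endpoint with curl, in the same way as the other commands in the question.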
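Basically, this query takes your range query or filter, plus whatever other conditions you need, and wraps it in a constant_score query so that every child gets a score of 1.0. The top_children query collects all of these children and accumulates their scores onto their parents. The filter then removes the parents that are too far away.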
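The scoring logic can be illustrated outside Elasticsearch with a small simulation. This is only a sketch of the idea, not how ES computes it internally: the helper names, the `lat`/`lon` record fields, and the haversine distance are assumptions chosen for the example.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def score_items(items, slots, slot_matches, center, max_km):
    """Sum 1.0 per matching slot onto its parent, then drop distant parents."""
    scores = defaultdict(float)
    for slot in slots:
        if slot_matches(slot):
            scores[slot["parent_id"]] += 1.0  # every matching child counts as 1.0
    result = {}
    for item in items:
        score = scores.get(item["id"], 0.0)
        distance = haversine_km(item["lat"], item["lon"], center[0], center[1])
        if score > 0 and distance <= max_km:
            result[item["id"]] = score
    return result


if __name__ == "__main__":
    items = [{"id": 35, "lat": 40.4, "lon": -3.6}, {"id": 36, "lat": 41.4, "lon": 2.2}]
    slots = [
        {"parent_id": 35, "day": 1},
        {"parent_id": 35, "day": 1},
        {"parent_id": 36, "day": 1},
    ]
    print(score_items(items, slots, lambda s: s["day"] == 1, (40.4, -3.7), 200))
```

An item with two matching slots ends up with twice the score of an item with one, which is the ranking behaviour asked for in the question.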