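First of all, I don't use Elasticsearch very often, so I apologize in advance for any silly queries ;-).
I'm currently working for an editorial team.
Our ES info is as follows: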
{
"name" : "Lockjaw",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "UUID",
"version" : {
"number" : "2.4.6",
"build_hash" : "5376dca9f70f3abef96a77f4bb22720ace8240fd",
"build_timestamp" : "2017-07-18T12:17:44Z",
"build_snapshot" : false,
"lucene_version" : "5.5.4"
},
"tagline" : "You Know, for Search"
}
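Some background. Our backend runs on WordPress and we are using tags. OK, that's pretty standard, but some tags are flagged with "is_topic". So here is what I'm trying to achieve.
Question 1. When a user saves a post, the system should find related posts based on tags and topics. Tags are more important than topics. So I tried the following query: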
"query":{
"filtered":{
"filter":{
"bool":{
"must":[
{
"term":{
"post_type":"post"
}
},
{
"range":{
"post_date":{
"gte":"2018-06-15 09:00:00"
}
}
}
],
"should":[
{
"match":{
"terms.post_tag.term_id":{
"query":[
38,
11642
],
"boost":1
}
}
},
{
"match":{
"terms.post_tag.term_id":{
"query":[
1133,
8708,
27774
],
"boost":2
}
}
}
]
}
}
}
},
"size":5
In the query above, the first "should" holds my topics and the second "should" holds my tags. I get this error:
{
"error":{
"root_cause":[
{
"type":"query_parsing_exception",
"reason":"[match] unknown token [START_ARRAY] after [query]",
"index":"myindexname",
"line":1,
"col":199
}
],
"type":"search_phase_execution_exception",
"reason":"all shards failed",
"phase":"query",
"grouped":true,
"failed_shards":[
{
"shard":0,
"index":"myindexname",
"node":"oVegK7J9Tf6T-IXRUXGYvg",
"reason":{
"type":"query_parsing_exception",
"reason":"[match] unknown token [START_ARRAY] after [query]",
"index":"myindexname",
"line":1,
"col":199
}
}
]
},
"status":400
}
Here is a sample document that should be found:
"post_id" : 477398,
"post_date" : "2018-02-28 08:00:00",
"post_date_gmt" : "2018-02-28 07:00:00",
"post_title" : "Article Title",
"post_excerpt" : "",
"post_content" : "Content Here",
"post_status" : "publish",
"post_name" : "article-title",
"post_type" : "post",
"post_mime_type" : "",
"permalink" : "https://www.example.com/archive/2018/02/28/article-title/",
"terms" : {
"category" : [ {
"term_id" : 1,
"slug" : "artikelen",
"name" : "Alle artikelen",
"parent" : 0
}, {
"term_id" : 15035,
"slug" : "commerce",
"name" : "Commerce",
"parent" : 0
} ],
"post_tag" : [ {
"term_id" : 29297,
"slug" : "custom-labels",
"name" : "Custom labels",
"parent" : 0
}, {
"term_id" : 38,
"slug" : "e-commerce",
"name" : "E-commerce",
"parent" : 0
}, {
"term_id" : 2345,
"slug" : "google-adwords",
"name" : "Google AdWords",
"parent" : 0
}, {
"term_id" : 11642,
"slug" : "google-shopping",
"name" : "Google Shopping",
"parent" : 0
}, {
"term_id" : 1133,
"slug" : "webshops",
"name" : "Webshops",
"parent" : 0
} ],
"post-content-type" : [ {
"term_id" : 8708,
"slug" : "strategie",
"name" : "Strategie",
"parent" : 0
} ],
"sector" : [ {
"term_id" : 27774,
"slug" : "retail-webshops",
"name" : "Retail & Webshops",
"parent" : 0
} ]
}
The mapping for this field is as follows:
"terms" : {
"properties" : {
"post_tag" : {
"properties" : {
"name" : {
"type" : "string",
"fields" : {
"raw" : {
"type" : "string",
"index" : "not_analyzed"
},
"sortable" : {
"type" : "string",
"analyzer" : "ewp_lowercase"
}
}
},
"parent" : {
"type" : "long"
},
"slug" : {
"type" : "string",
"index" : "not_analyzed"
},
"term_id" : {
"type" : "long"
}
}
}
}
}
Can someone help me format this query? Or is this not even possible?
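For reference, one direction that might work, as an untested sketch: the error message suggests that `match` does not accept an array for `query`, whereas the `terms` query takes a list of values. Also, clauses placed inside the `filter` of a `filtered` query do not contribute to the score, so the boosted clauses would probably need to sit in query context. The Python below only builds the request body; the function name `related_posts_query` and the choice of `minimum_should_match: 1` (require at least one shared tag or topic) are my own assumptions, not something confirmed against our 2.4.6 cluster.

```python
import json


def related_posts_query(topic_ids, tag_ids, since, size=5):
    """Build a search body: hard filters plus boosted tag/topic clauses.

    Hypothetical helper; field names are taken from the mapping above.
    """
    return {
        "query": {
            "bool": {
                # Non-scoring restrictions: post type and date range.
                "filter": [
                    {"term": {"post_type": "post"}},
                    {"range": {"post_date": {"gte": since}}},
                ],
                # Scoring clauses: tags get a higher boost than topics.
                "should": [
                    {"terms": {"terms.post_tag.term_id": topic_ids, "boost": 1}},
                    {"terms": {"terms.post_tag.term_id": tag_ids, "boost": 2}},
                ],
                # Assumption: a post with no shared tag or topic is not "related".
                "minimum_should_match": 1,
            }
        },
        "size": size,
    }


body = related_posts_query(
    topic_ids=[38, 11642],
    tag_ids=[1133, 8708, 27774],
    since="2018-06-15 09:00:00",
)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the search request body. Whether the boosts produce a useful ranking on a numeric `term_id` field is something I would still have to check against real data.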
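Question 2: Some documents should not be returned. We are setting up a blacklist for this, which is a simple must_not query. But here is the tricky part. As you can see in my query, I filter on a range, so only documents with a post_date less than 2 years old relative to the new post's date are returned. However, we also want a whitelist of documents that are older than 2 years. Is this even possible? And how?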
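To make the intent concrete, the logic I have in mind is "(recent OR whitelisted) AND NOT blacklisted". A rough, untested sketch of how that might be expressed with nested `bool` clauses is below. The helper name `date_or_whitelist_query`, the use of `post_id` as the field to match against, and the blacklist id `123456` are assumptions for illustration only; the `post_id` mapping is not shown above, so I am assuming it is numeric.

```python
import json


def date_or_whitelist_query(since, whitelist_ids, blacklist_ids):
    """Build a bool query: (recent OR whitelisted) AND NOT blacklisted.

    Hypothetical helper; post_id as the match field is an assumption.
    """
    return {
        "bool": {
            "filter": [
                {"term": {"post_type": "post"}},
                {
                    # Either condition is enough to pass this filter.
                    "bool": {
                        "should": [
                            {"range": {"post_date": {"gte": since}}},
                            {"terms": {"post_id": whitelist_ids}},
                        ],
                        "minimum_should_match": 1,
                    }
                },
            ],
            # Blacklisted posts are excluded regardless of date or whitelist.
            "must_not": [{"terms": {"post_id": blacklist_ids}}],
        }
    }


query = date_or_whitelist_query(
    since="2018-06-15 09:00:00",
    whitelist_ids=[477398],   # the older sample document shown above
    blacklist_ids=[123456],   # made-up id, for illustration only
)
print(json.dumps({"query": query, "size": 5}, indent=2))
```

With this shape the whitelist only widens the date restriction, while the blacklist still wins if a post appears in both lists. That precedence is my own choice here and could be flipped if needed.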
Thanks in advance!