I want to filter and fetch data from Elasticsearch. I tried a date histogram aggregation, but it does not do what I need. My data looks like this:
[
{
"id":1,
"title":"Sample news",
"date":"2020-09-17",
"regulation":[
{
"id":1,
"name":"sample name",
"date":"2020-09-17"
},
{
"id":2,
"name":"sample name 1",
"date":"2020-09-18"
}
]
},
{
"id":2,
"title":"Sample news 1",
"date":"2020-09-17",
"regulation":[
{
"id":1,
"name":"sample name",
"date":"2020-09-18"
},
{
"id":2,
"name":"sample name 1",
"date":"2020-09-17"
}
]
}
]
I want to filter it and get data in the following shape:
year: {
month: {
day: {
news: int,
regulations: int,
}
}
}
That is, the number of news items and regulations per day, organized in a date hierarchy. So far I can get data like this:
"2020-09-17" : {
"key_as_string" : "2020-09-17",
"key" : 1600300800000,
"doc_count" : 1
},
"2020-09-18" : {
"key_as_string" : "2020-09-18",
"key" : 1600387200000,
"doc_count" : 0
},
"2020-09-19" : {
"key_as_string" : "2020-09-19",
"key" : 1600473600000,
"doc_count" : 0
},
using this query:
GET /news/_search?size=0
{
"aggs": {
"news_over_time": {
"date_histogram": {
"field": "date",
"calendar_interval": "day",
"keyed": true,
"format": "yyyy-MM-dd"
}
}
}
}
But this does not solve my problem. How can I do this with Elasticsearch and Elasticsearch DSL?
Expected response:
2020: {
09: {
17: {
news: 2,
regulation: 2
},
18: {
news: 0,
regulation: 2
}
}
}
Answer 0 (score: 2)
I'm not sure what your expected response is, but if you want the number of news items per day, this is what you are looking for:
GET /news/_search?size=0
{
"aggs": {
"news_over_time": {
"date_histogram": {
"field": "regulation.date",
"calendar_interval": "day",
"format": "yyyy-MM-dd"
}
}
}
}
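Answer 1 (score: 1)
The news date and the regulation date are two different fields: one belongs to the parent document and the other to the nested documents. I'm not sure your requirement can be met directly in a single aggregation (I'm looking for help with this myself). However, the query below should also work for you.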
GET news/_search
{
"size": 0,
"aggs": {
"news_over_time": {
"date_histogram": {
"field": "date",
"calendar_interval": "day",
"keyed": true,
"format": "yyyy-MM-dd"
}
},
"regulations_over_time": {
"nested": {
"path": "regulation"
},
"aggs": {
"regulation": {
"date_histogram": {
"field": "regulation.date",
"calendar_interval": "day",
"keyed": true,
"format": "yyyy-MM-dd"
}
}
}
}
}
}
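Since the question also asks about Python, here is a minimal sketch that builds the same request body as a plain Python dict. The helper names `daily_histogram` and `build_request_body` are made up for illustration, and the sketch assumes `yyyy-MM-dd` as the key format. It only constructs the body; how you send it depends on your client (with elasticsearch-dsl, `Search.from_dict(body)` is one option, assuming that library is installed).

```python
import json


def daily_histogram(field):
    """Return a keyed, per-day date_histogram aggregation on `field`."""
    return {
        "date_histogram": {
            "field": field,
            "calendar_interval": "day",
            "keyed": True,
            "format": "yyyy-MM-dd",
        }
    }


def build_request_body():
    """Build the two-histogram request: news by date, regulations by nested date."""
    return {
        "size": 0,  # only the aggregations are needed, not the hits
        "aggs": {
            "news_over_time": daily_histogram("date"),
            "regulations_over_time": {
                # `regulation` is assumed to be mapped as a nested field
                "nested": {"path": "regulation"},
                "aggs": {"regulation": daily_histogram("regulation.date")},
            },
        },
    }


body = build_request_body()
print(json.dumps(body, indent=2))
```

Sharing one helper for both histograms keeps the interval and key format identical, which matters later if the two bucket maps are joined on their date keys.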
The query returns results in the following form:
"aggregations" : {
"regulations_over_time" : { //<=== Regulations over time based on regulationDate
"doc_count" : 9,
"regulation" : {
"buckets" : {
"2020-09-17" : {
"key_as_string" : "2020-09-17",
"key" : 1600300800000,
"doc_count" : 5
},
"2020-09-18" : {
"key_as_string" : "2020-09-18",
"key" : 1600387200000,
"doc_count" : 4
}
}
}
},
"news_over_time" : { //<======= news over time based on news date
"buckets" : {
"2020-09-17" : {
"key_as_string" : "2020-09-17",
"key" : 1600300800000,
"doc_count" : 2
},
"2020-09-18" : {
"key_as_string" : "2020-09-18",
"key" : 1600387200000,
"doc_count" : 2
}
}
}
}
}
You can then merge these two sets of statistics as needed.
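One way to do that merge on the client side is sketched below. It assumes the `aggregations` part of the response has already been parsed into a Python dict, that both histograms are keyed with `yyyy-MM-dd` strings, and that the aggregation names match the query above. The function name `merge_daily_counts` is made up for this example, and the sample counts are the ones from the response shown in this answer.

```python
def merge_daily_counts(aggregations):
    """Combine both keyed histograms into year -> month -> day -> counts."""
    news = aggregations["news_over_time"]["buckets"]
    regs = aggregations["regulations_over_time"]["regulation"]["buckets"]

    tree = {}
    # Walk the union of days so a day present in only one histogram still appears.
    for day_key in sorted(set(news) | set(regs)):
        year, month, day = day_key.split("-")
        tree.setdefault(year, {}).setdefault(month, {})[day] = {
            "news": news.get(day_key, {}).get("doc_count", 0),
            "regulations": regs.get(day_key, {}).get("doc_count", 0),
        }
    return tree


# Sample input trimmed to the fields the function reads.
sample = {
    "regulations_over_time": {
        "doc_count": 9,
        "regulation": {
            "buckets": {
                "2020-09-17": {"doc_count": 5},
                "2020-09-18": {"doc_count": 4},
            }
        },
    },
    "news_over_time": {
        "buckets": {
            "2020-09-17": {"doc_count": 2},
            "2020-09-18": {"doc_count": 2},
        }
    },
}

result = merge_daily_counts(sample)
print(result["2020"]["09"]["17"])  # {'news': 2, 'regulations': 5}
```

The result has the year/month/day nesting the question asks for. Days missing from one histogram get a count of 0 for that side rather than being dropped.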