I am trying to implement a search query on records stored in Elasticsearch. A record looks like this:
{
"_index" : "box_info_store",
"_type" : "boxes",
"_id" : "pWjQLWkBIJk0ORjd0X2P",
"_score" : null,
"_source" : {
"transactionID" : "60ab66cf24c9924f562bf1a2b5d92305d0a6",
"boxNumber" : "Box3",
"createDate" : "2013-09-17T00:00:00",
"itemNumber" : "Item1",
"address" : "Sample Address"
}
}
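A box can contain multiple items. For example, Box3 can contain Item1, Item2, and Item3, so in Elasticsearch I will have 3 separate documents. The same box and the same item can also appear more than once with different addresses. The transactionID of these documents may or may not be the same.
My requirement is to fetch the n most recent distinct transactionIDs along with their records.
I tried the following query to get the last 7 distinct transactionIDs: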
GET /box_info_store/boxes/_search?size=7
{
"query": {
"bool": {
"must": [
{"match":{"boxNumber":"Box3"}},
{"match":{"itemNumber":"Item1"}}
]
}
},
"sort": [
{
"createDate": {
"order": "desc"
}
}
],
"aggs": {
"distinct_transactions": {
"terms": { "field": "transactionID"}
}
}
}
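This returned 7 documents in which boxNumber is Box3 and itemNumber is Item1, but not 7 distinct transactionIDs: two of the seven documents have the same transactionID (although each has a different address). My requirement is to get 7 distinct transactionIDs, no matter how many documents are returned.
I hope I have explained myself clearly. Any help is appreciated.
Thanks
------ Edit: @gaurav9620, I ran the first query and got a count of 32, then ran the second query with a count of 3 and got the following result: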
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 32,
"max_score" : null,
"hits" : [
{
"_index" : "box_info_store",
"_type" : "boxes",
"_id" : "RWjRLWkBIJk0ORjdEX-L",
"_score" : null,
"_source" : {
"transactionID" : "3087e106244f6247a5290fb21ce64254529c",
"boxNumber" : "Box3",
"createDate" : "2017-11-15T00:00:00",
"itemNumber" : "Item1",
"address" : "sampleAddress12"
},
"sort" : [
1510704000000
]
},
{
"_index" : "box_info_store",
"_type" : "boxes",
"_id" : "MGjQLWkBIJk0ORjdwX0M",
"_score" : null,
"_source" : {
"transactionID" : "60ab66cf24c9924f562bf1a2b5d92305d0a6",
"boxNumber" : "Box3",
"createDate" : "2016-04-03T00:00:00",
"itemNumber" : "Item1",
"address" : "sampleAddress321"
},
"sort" : [
1459641600000
]
},
..........
..........
..........
{
"_index" : "box_info_store",
"_type" : "boxes",
"_id" : "AGjRLWkBIJk0ORjdK4CJ",
"_score" : null,
"_source" : {
"transactionID" : "3087e106244f6247a5290fb21ce64254529c",
"boxNumber" : "Box3",
"createDate" : "1996-02-16T00:00:00",
"itemNumber" : "Item1",
"address" : "sampleAddress4324"
},
"sort" : [
824428800000
]
}
]
},
"aggregations" : {
"unique_transactions" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 16,
"buckets" : [
{
"key" : "3087e106244f6247a5290fb21ce64254529c",
"doc_count" : 6
},
{
"key" : "27c5f3422f4482495d29e7b2c15c0e311743",
"doc_count" : 5
},
{
"key" : "c40e53212e74e24bf02a5bd2b134cf92bffb",
"doc_count" : 5
}
]
}
}
}
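Answer 0 (score: 0)
The size you are using is the number of raw documents returned, not the number of distinct transactionIDs.
In your case:
Include a size parameter in the aggregation, and it will return 7 unique IDs.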
GET /box_info_store/boxes/_search?size=7
{
"query": {
"bool": {
"must": [
{
"match": {
"boxNumber": "Box3"
}
},
{
"match": {
"itemNumber": "Item1"
}
}
]
}
},
"sort": [
{
"createDate": {
"order": "desc"
}
}
],
"aggs": {
"distinct_transactions": {
"terms": {
"field": "transactionID",
"size": 7
}
}
}
}
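The request above can also be assembled programmatically. The following is a minimal Python sketch, not the answerer's code; the helper name `build_distinct_query` is hypothetical. It goes one step beyond the answer: a `terms` aggregation orders buckets by document count by default, not by recency, so the sketch adds a `max` sub-aggregation on `createDate` and orders the buckets by it. It assumes `transactionID` is aggregatable (for example a keyword field), and the `top_hits` size of 10 records per transaction is an arbitrary choice for illustration.

```python
import json


def build_distinct_query(box_number, item_number, n):
    """Build a search body for the n transactionIDs with the newest createDate."""
    return {
        "size": 0,  # skip raw hits; the buckets carry the result
        "query": {
            "bool": {
                "must": [
                    {"match": {"boxNumber": box_number}},
                    {"match": {"itemNumber": item_number}},
                ]
            }
        },
        "aggs": {
            "distinct_transactions": {
                "terms": {
                    "field": "transactionID",
                    "size": n,  # number of distinct IDs to return
                    "order": {"latest": "desc"},  # newest transaction first
                },
                "aggs": {
                    # newest createDate inside each transactionID bucket
                    "latest": {"max": {"field": "createDate"}},
                    # the records belonging to each transactionID
                    # (10 per bucket is an assumed, arbitrary limit)
                    "records": {
                        "top_hits": {
                            "size": 10,
                            "sort": [{"createDate": {"order": "desc"}}],
                        }
                    },
                },
            }
        },
    }


body = build_distinct_query("Box3", "Item1", 7)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of `GET /box_info_store/boxes/_search`. Each bucket then holds one transactionID together with its most recent records.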
Edit ------------------------------------->
First, run this query:
GET /box_info_store/boxes/_search?size=0
{
"query": {
"bool": {
"must": [
{
"match": {
"boxNumber": "Box3"
}
},
{
"match": {
"itemNumber": "Item1"
}
}
]
}
}
}
Here you will find the total number of documents matching your query, which you can use as n. After that, run your query as follows:
GET /box_info_store/boxes/_search?size=n
{
"query": {
"bool": {
"must": [
{
"match": {
"boxNumber": "Box3"
}
},
{
"match": {
"itemNumber": "Item1"
}
}
]
}
},
"sort": [
{
"createDate": {
"order": "desc"
}
}
],
"aggs": {
"distinct_transactions": {
"terms": {
"field": "transactionID",
"size": NUMBER_OF_UNIQUE_TRANSACTION_IDS_TO_BE_FETCHED
}
}
}
}
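With the two-step approach above, the hits come back sorted by createDate in descending order, so the n most recent distinct transactionIDs and their records can also be picked out on the client. The following is a minimal Python sketch under that assumption; `latest_distinct_transactions` is a hypothetical helper name, not part of any Elasticsearch client, and the sample hits are abbreviated from the response shown in the question.

```python
from collections import OrderedDict


def latest_distinct_transactions(hits, n):
    """Group hits by transactionID, keeping the first n IDs in hit order.

    Assumes `hits` is the response's hits.hits list, already sorted by
    createDate descending, so the first occurrence of an ID is its newest record.
    """
    grouped = OrderedDict()
    for hit in hits:
        source = hit["_source"]
        tx_id = source["transactionID"]
        if tx_id not in grouped:
            if len(grouped) == n:
                continue  # n IDs already found; ignore any further new IDs
            grouped[tx_id] = []
        grouped[tx_id].append(source)
    return grouped


# Abbreviated sample hits, taken from the response in the question.
sample_hits = [
    {"_source": {"transactionID": "3087e106244f6247a5290fb21ce64254529c",
                 "createDate": "2017-11-15T00:00:00",
                 "address": "sampleAddress12"}},
    {"_source": {"transactionID": "60ab66cf24c9924f562bf1a2b5d92305d0a6",
                 "createDate": "2016-04-03T00:00:00",
                 "address": "sampleAddress321"}},
    {"_source": {"transactionID": "3087e106244f6247a5290fb21ce64254529c",
                 "createDate": "1996-02-16T00:00:00",
                 "address": "sampleAddress4324"}},
]

result = latest_distinct_transactions(sample_hits, 2)
for tx_id, records in result.items():
    print(tx_id, [r["address"] for r in records])
```

Fetching every matching document to do this grouping may be costly on large result sets, so this is only reasonable when the total count from the first query is small.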