I have searched for an answer to this but unfortunately could not find one. I have an index that contains a users type:
users: {
properties: {
loginKey: {
type: string
}
timeZone: {
type: long
}
maxEmailsPerWeek: {
type: long
}
joinDate: {
format: dateOptionalTime
type: date
}
preferredEntityId: {
type: long
}
partition: {
type: long
}
postalCode: {
type: string
}
nickName: {
type: string
}
announcements: {
type: long
}
gender: {
type: string
}
birthDate: {
format: dateOptionalTime
type: date
}
firstName: {
type: string
}
emailTestId: {
type: long
}
emailStateDate: {
format: dateOptionalTime
type: date
}
lastName: {
type: string
}
emailAddress: {
type: string
}
...
}
}
There is also an activity type that records each user's activity:
activity: {
_routing: {
required: true
}
properties: {
eventTimestamp: {
format: dateOptionalTime
type: date
}
userAgent: {
type: string
}
recordType: {
type: string
}
universalTrackingParams: {
properties: {
MODULE_ID: {
type: string
}
TRACKING_CODE: { // this is a unique user identifier
index: not_analyzed
omit_norms: true
index_options: docs
type: string
}
SENDING_DOMAIN_PARAM: {
index: not_analyzed
omit_norms: true
index_options: docs
type: string
}
PRODUCT_ID: {
type: string
}
TEST_ID: {
type: string
}
MAILING_ID: {
type: string
}
NEWS_LETTER_ID: {
type: string
}
LINK_POSITION: {
type: integer
}
DECORATION_TIMESTAMP: {
type: string
}
SITE_ID: {
type: string
}
TEMPLATE_VERSION: {
type: string
}
ORIGINAL_LINK: {
index: not_analyzed
omit_norms: true
index_options: docs
type: string
}
}
}
ip: {
index: not_analyzed
omit_norms: true
index_options: docs
type: string
}
}
_parent: {
type: users
}
}
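What I want to do is find all parents that have N children. In other words, I want every user record that has activity more than N times within a given time period (eventTimestamp).
Can anyone suggest a resource I could read, or a query that would achieve this?
UPDATE: Here is what I ended up doing (using the index and types created by Sloan Ahrens below):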
{
"min_score": 2,
"query": {
"top_children": {
"type": "order",
"score": "sum",
"query": {
"constant_score": {
"query": {
"match_all": {}
}
}
}
}
}
}
This returns all customers that have at least 3 orders (thanks to imotov).
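The query above matches every child document, while the question also asks for a time window. A minimal sketch of how the same request body could be built with a date restriction follows, assuming the range query accepts "from"/"to" the same way the range filter in the answer below does. The helper name top_children_query and its parameters are hypothetical, and the sketch has not been run against a live cluster.

```python
import json


def top_children_query(child_type, date_field, start, end, min_score):
    """Build a top_children search body limited to children in a date window.

    Hypothetical helper: it only assembles the JSON body, it does not
    talk to Elasticsearch.
    """
    return {
        "min_score": min_score,
        "query": {
            "top_children": {
                "type": child_type,
                "score": "sum",  # add up the scores of the matching children
                "query": {
                    "constant_score": {
                        # Assumption: match_all is replaced by a range query
                        # so only children inside the window are counted.
                        "query": {
                            "range": {date_field: {"from": start, "to": end}}
                        }
                    }
                },
            }
        },
    }


body = top_children_query(
    "order", "date", "2013-11-01T00:00:00", "2013-12-31T23:59:59", 2
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the -d payload of a curl request to the _search endpoint of the index, in the same way as the requests in the answer below.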
Answer 0 (score: 2)
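Well, this is certainly not an entirely satisfying solution, since it requires two queries, but I think you can get what you want using facets.
Simplifying a bit (and using the schema/data from this blog post), I will first create a simple index with a parent/child relationship: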
curl -XPUT "http://localhost:9200/orders" -d'
{
"mappings": {
"customer": {},
"order" : {
"_parent" : {
"type" : "customer"
}
}
}
}'
Then add some data:
curl -XPOST "http://localhost:9200/orders/_bulk" -d'
{ "index" : { "_type" : "customer", "_id" : "john" } }
{ "name" : "John Doe" }
{ "index" : { "_type" : "order", "_parent" : "john" } }
{ "date" : "2013-10-15T12:00:00" }
{ "index" : { "_type" : "order", "_parent" : "john" } }
{ "date" : "2013-11-15T12:00:00" }
{ "index" : { "_type" : "order", "_parent" : "john" } }
{ "date" : "2013-12-01T12:00:00" }
{ "index" : { "_type" : "customer", "_id" : "jane" } }
{ "name" : "Jane Doe" }
{ "index" : { "_type" : "order", "_parent" : "jane" } }
{ "date" : "2013-11-20T12:00:00" }
{ "index" : { "_type" : "customer", "_id" : "bob" } }
{ "name" : "Bob Doe" }
{ "index" : { "_type" : "order", "_parent" : "bob" } }
{ "date" : "2013-09-20T12:00:00" }
'
Then I can facet on the "_parent" field of the order type, filtering the faceted documents on date:
curl -XPOST "http://localhost:9200/orders/order/_search" -d'
{
"size": 0,
"facets": {
"customers": {
"terms": {
"field": "_parent"
},
"facet_filter": {
"range": {
"date": {
"from": "2013-11-01T00:00:00"
}
}
}
}
}
}'
The facet request gives me the following response:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 1,
"hits": []
},
"facets": {
"customers": {
"_type": "terms",
"missing": 0,
"total": 3,
"other": 0,
"terms": [
{
"term": "customer#john",
"count": 2
},
{
"term": "customer#jane",
"count": 1
}
]
}
}
}
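The facet request above hard-codes an open-ended start date, whereas the question asks about a bounded period. As a rough sketch, the same body could be generated for an arbitrary window. The function facet_request and its arguments are hypothetical names. The "to" bound and the "size" setting on the terms facet are assumptions about the facet API of that Elasticsearch generation and should be checked against the version in use.

```python
import json


def facet_request(date_field, start, end=None, max_terms=10):
    """Build the terms-facet body on "_parent", filtered to a date window.

    Hypothetical helper that only produces the JSON body.
    """
    date_range = {"from": start}
    if end is not None:
        # Assumption: the range filter accepts an upper bound named "to".
        date_range["to"] = end
    return {
        "size": 0,  # no hits needed, only the facet counts
        "facets": {
            "customers": {
                # Assumption: "size" caps how many parent terms come back.
                "terms": {"field": "_parent", "size": max_terms},
                "facet_filter": {"range": {date_field: date_range}},
            }
        },
    }


request_body = facet_request("date", "2013-11-01T00:00:00", max_terms=100)
print(json.dumps(request_body, indent=2))
```

Raising max_terms matters when many parents have activity in the window, because a parent that is cut off from the facet result cannot be selected in the next step.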
Then I can use the returned IDs to retrieve the customer documents with a second query:
curl -XPOST "http://localhost:9200/orders/_search" -d'
{
"query": {
"ids": {
"type": "customer",
"values": [
"john",
"jane"
]
}
}
}'
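You will have to add your own logic between the last two steps to decide which customers to retrieve based on the facet counts, but you can probably make this approach work in your context.
Here is a runnable example you can play with: http://sense.qbox.io/gist/9ebde72ccffa0dce654383ad4fb0a8451b74a9f7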
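The logic between the two requests could look roughly like the sketch below. It takes the parsed facet response, keeps the parents whose count reaches a threshold, and builds the body of the ids query. The function names and the min_count parameter are hypothetical, and the "customer#john" term format is taken from the response shown above.

```python
def parents_with_min_children(response, facet_name, min_count):
    """Return parent ids whose facet count is at least min_count."""
    ids = []
    for entry in response["facets"][facet_name]["terms"]:
        if entry["count"] >= min_count:
            # "_parent" terms look like "customer#john"; keep the id part.
            ids.append(entry["term"].split("#", 1)[1])
    return ids


def ids_query(doc_type, ids):
    """Build the body of the second request, which fetches the parents."""
    return {"query": {"ids": {"type": doc_type, "values": ids}}}


# Facet section of the response shown above, trimmed to the fields used here.
response = {
    "facets": {
        "customers": {
            "terms": [
                {"term": "customer#john", "count": 2},
                {"term": "customer#jane", "count": 1},
            ]
        }
    }
}

selected = parents_with_min_children(response, "customers", 2)
print(selected)  # ['john']
print(ids_query("customer", selected))
```

With a threshold of 2, only john qualifies for the sample data, because jane has a single order on or after 2013-11-01. If the requirement is strictly "more than N", pass N + 1 as min_count.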