I have a collection, col, that contains the following document:
{
"_id": 1,
"shares": [{
"fundcode": "000001",
"lastshares": 1230.20,
"agencyno": "260",
"netno": "260"
},{
"fundcode": "000002",
"lastshares": 213124.00,
"agencyno": "469",
"netno": "001"
},{
"fundcode": "000003",
"lastshares": 10000.80,
"agencyno": "469",
"netno": "002"
}
],
"trade": [{
"fundcode": "000001",
"c_date": "20160412",
"agencyno": "260",
"netno": "260",
"bk_tradetype": "122",
"confirmbalance": 1230.20,
"cserialno": "10110000119601",
"status": "1"
},{
"fundcode": "000002",
"c_date": "20160506",
"agencyno": "469",
"netno": "001",
"bk_tradetype": "122",
"confirmbalance": 213124.00,
"cserialno": "10110000119602",
"status": "1"
},{
"fundcode": "000003",
"c_date": "20170507",
"agencyno": "469",
"netno": "002",
"bk_tradetype": "122",
"confirmbalance": 10000.80,
"netvalue": 1.0000,
"cserialno": "10110000119602",
"status": "1"
}
]
}
How can I write a MongoDB query that does the same as the following SQL select?
SELECT _id
FROM col
WHERE col.shares.lastshares > 1000
AND col.trade.agencyno = '469'
GROUP BY _id
HAVING COUNT(DISTINCT col.shares.fundcode) > 2
AND COUNT(DISTINCT col.trade.fundcode) > 2
I tried an aggregation pipeline that applies $unwind, $group and $match twice, but I did not get the right answer. Thanks for your help.
Answer 0 (score: 0)
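It does not help that the sample provided does not meet the conditions: the "trade" array yields only 2 distinct matches, which is not enough to satisfy the "greater than 2" constraint in the query.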
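The structure is certainly different from an RDBMS, so "sub-queries" do not apply, but at least you modeled the data as arrays. Ideally, though, we would not use $unwind at all.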
So all we need to do is "count" the "distinct" matches within the arrays. This can be done inside a $redact stage, using $map, $setDifference and $size as the main operations:
db.getCollection('collection').aggregate([
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
"trade.agencyno": "469"
}},
{ "$redact": {
"$cond": {
"if": {
"$and": [
{ "$gt": [
{ "$size": {
"$setDifference": [
{ "$map": {
"input": "$shares",
"as": "el",
"in": {
"$cond": {
"if": { "$gt": [ "$$el.lastshares", 1000 ] },
"then": "$$el.fundcode",
"else": false
}
}
}},
[false]
]
}},
2
]},
{ "$gt": [
{ "$size": {
"$setDifference": [
{ "$map": {
"input": "$trade",
"as": "el",
"in": {
"$cond": {
"if": { "$eq": [ "$$el.agencyno", "469" ] },
"then": "$$el.fundcode",
"else": false
}
}
}},
[false]
]
}},
2
]}
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
/*
{ "$addFields": {
"shares": {
"$filter": {
"input": "$shares",
"as": "el",
"cond": { "$gt": [ "$$el.lastshares", 1000 ] }
}
},
"trade": {
"$filter": {
"input": "$trade",
"as": "el",
"cond": { "$eq": [ "$$el.agencyno", "469" ] }
}
}
}}
*/
])
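This form is basically compatible with MongoDB 2.6 and later. The commented-out $addFields stage (which requires MongoDB 3.4 or later) is only there so you can see the result of the "filter". It is not needed, because the query in the question asks for "just the document _id"; returning the whole document simply takes less work. If you really want only the _id, add a $project for _id at the end.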
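As a rough sketch of that last point, a $project stage along these lines could be appended to either pipeline to return only the _id. The names idOnlyStage and withIdOnly are illustrative helpers and not part of the original answer; the resulting array would still be passed to aggregate() in the usual way.

```javascript
// Hypothetical helper (not from the answer): append a $project stage
// so the pipeline returns only the _id of each matching document.
const idOnlyStage = { "$project": { "_id": 1 } };

function withIdOnly(pipeline) {
  // Return a new array so the original pipeline is left untouched.
  return pipeline.concat([idOnlyStage]);
}

// Example using just the initial $match stage from the answer.
const basePipeline = [
  { "$match": {
    "shares.lastshares": { "$gt": 1000 },
    "trade.agencyno": "469"
  }}
];

const idOnlyPipeline = withIdOnly(basePipeline);
// In the mongo shell this would be run as:
// db.getCollection('collection').aggregate(idOnlyPipeline)
```

The $redact stage would sit between the $match and the $project; it is left out here only to keep the sketch short.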
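Alternatively, as a matter of taste, you can use $filter with MongoDB 3.x releases (3.2 and later), but the syntax in this case is actually a little longer: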
db.getCollection('collection').aggregate([
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
"trade.agencyno": "469"
}},
{ "$redact": {
"$cond": {
"if": {
"$and": [
{ "$gt": [
{ "$size": {
"$setDifference": [
{ "$map": {
"input": {
"$filter": {
"input": "$shares",
"as": "el",
"cond": { "$gt": [ "$$el.lastshares", 1000 ] }
}
},
"as": "el",
"in": "$$el.fundcode"
}},
[]
]
}},
2
]},
{ "$gt": [
{ "$size": {
"$setDifference": [
{ "$map": {
"input": {
"$filter": {
"input": "$trade",
"as": "el",
"cond": { "$eq": [ "$$el.agencyno", "469" ] }
}
},
"as": "el",
"in": "$$el.fundcode"
}},
[]
]
}},
2
]}
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
/*
{ "$addFields": {
"shares": {
"$filter": {
"input": "$shares",
"as": "el",
"cond": { "$gt": [ "$$el.lastshares", 1000 ] }
}
},
"trade": {
"$filter": {
"input": "$trade",
"as": "el",
"cond": { "$eq": [ "$$el.agencyno", "469" ] }
}
}
}}
*/
])
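The basic principle here is that the condition:
having (count(distinct fundcode))...
is achieved by applying both $size and $setDifference to the "filtered" array content. The "GROUP BY" part is not even needed, because the arrays already represent the relationship in "grouped" form. Think of the whole $redact statement as the "HAVING".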
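To make the counting step concrete, here is a minimal plain JavaScript emulation of the $map / $setDifference / $size chain, run against the relevant fields of the sample document. The function name distinctMatches is made up for illustration; this only mirrors the logic and is not how the server evaluates the pipeline.

```javascript
// Plain JavaScript emulation of the $map / $setDifference / $size chain.
// distinctMatches is an illustrative name, not a MongoDB function.
function distinctMatches(array, predicate) {
  // $map with $cond: keep the fundcode when the element matches, else false.
  const mapped = array.map(el => (predicate(el) ? el.fundcode : false));
  // $setDifference against [false]: de-duplicate and drop the false markers.
  const distinct = new Set(mapped.filter(v => v !== false));
  // $size: number of distinct fundcode values that matched.
  return distinct.size;
}

// Only the fields used by the conditions, taken from the sample document.
const shares = [
  { fundcode: "000001", lastshares: 1230.20 },
  { fundcode: "000002", lastshares: 213124.00 },
  { fundcode: "000003", lastshares: 10000.80 }
];
const trade = [
  { fundcode: "000001", agencyno: "260" },
  { fundcode: "000002", agencyno: "469" },
  { fundcode: "000003", agencyno: "469" }
];

const sharesCount = distinctMatches(shares, el => el.lastshares > 1000);
const tradeCount = distinctMatches(trade, el => el.agencyno === "469");

// The "HAVING" part: both counts must be greater than 2.
const keep = sharesCount > 2 && tradeCount > 2;
```

For the sample document this gives 3 distinct matches for "shares" but only 2 for "trade", so keep is false and the document would be pruned.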
If your MongoDB is really ancient and you cannot use these forms, it is still possible with $unwind. This time we use $addToSet to collect the "distinct" entries:
db.getCollection('collection').aggregate([
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
"trade.agencyno": "469"
}},
{ "$unwind": "$shares" },
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
}},
{ "$group": {
"_id": "$_id",
"shares": { "$addToSet": "$shares.fundcode" },
"trade": { "$first": "$trade" }
}},
{ "$unwind": "$trade" },
{ "$match": {
"trade.agencyno": "469"
}},
{ "$group": {
"_id": "$_id",
"shares": { "$first": "$shares" },
"trade": { "$addToSet": "$trade.fundcode" }
}},
{ "$match": {
"shares.2": { "$exists": true },
"trade.2": { "$exists": true }
}}
])
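In this case the "HAVING" is represented by the final $match clause, where notation such as "shares.2": { "$exists": true } asks whether the array being tested has an element at index 2 (a third element). That in turn means it has "more than two" entries, which is the point of the condition.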
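The index test can be pictured in plain JavaScript as below. This is only a sketch of the idea: hasIndex is a hypothetical helper, and the two arrays are the distinct fundcode sets that the $unwind and $group stages would build from the sample document.

```javascript
// Plain JavaScript sketch of what { "shares.2": { "$exists": true } } tests:
// an element at index 2 exists only when the array has at least three entries.
function hasIndex(array, index) {
  return index < array.length;
}

// Distinct fundcode sets as the $unwind / $group stages would build them
// from the sample document.
const sharesSet = ["000003", "000002", "000001"];
const tradeSet = ["000003", "000002"];

const sharesOk = hasIndex(sharesSet, 2);    // three distinct entries
const tradeOk = hasIndex(tradeSet, 2);      // only two distinct entries
const tradeRelaxed = hasIndex(tradeSet, 1); // relaxed test: "at least two"
```

Testing for an index avoids a separate counting stage, since the existence of position n implies a length of at least n + 1.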
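As noted earlier, it would have helped if you had provided a document that actually meets the conditions you ask for. Unfortunately, the document provided does not reach the number of matches required for its "trade" array.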
Adjusting the condition so that it matches the document provided, by using $gte with 2 for the "trade" matches:
db.getCollection('collection').aggregate([
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
"trade.agencyno": "469"
}},
{ "$redact": {
"$cond": {
"if": {
"$and": [
{ "$gt": [
{ "$size": {
"$setDifference": [
{ "$map": {
"input": "$shares",
"as": "el",
"in": {
"$cond": {
"if": { "$gt": [ "$$el.lastshares", 1000 ] },
"then": "$$el.fundcode",
"else": false
}
}
}},
[false]
]
}},
2
]},
{ "$gte": [
{ "$size": {
"$setDifference": [
{ "$map": {
"input": "$trade",
"as": "el",
"in": {
"$cond": {
"if": { "$eq": [ "$$el.agencyno", "469" ] },
"then": "$$el.fundcode",
"else": false
}
}
}},
[false]
]
}},
2
]}
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}},
{ "$addFields": {
"shares": {
"$filter": {
"input": "$shares",
"as": "el",
"cond": { "$gt": [ "$$el.lastshares", 1000 ] }
}
},
"trade": {
"$filter": {
"input": "$trade",
"as": "el",
"cond": { "$eq": [ "$$el.agencyno", "469" ] }
}
}
}}
])
That form produces the following output:
{
"_id" : 1.0,
"shares" : [
{
"fundcode" : "000001",
"lastshares" : 1230.2,
"agencyno" : "260",
"netno" : "260"
},
{
"fundcode" : "000002",
"lastshares" : 213124.0,
"agencyno" : "469",
"netno" : "001"
},
{
"fundcode" : "000003",
"lastshares" : 10000.8,
"agencyno" : "469",
"netno" : "002"
}
],
"trade" : [
{
"fundcode" : "000002",
"c_date" : "20160506",
"agencyno" : "469",
"netno" : "001",
"bk_tradetype" : "122",
"confirmbalance" : 213124.0,
"cserialno" : "10110000119602",
"status" : "1"
},
{
"fundcode" : "000003",
"c_date" : "20170507",
"agencyno" : "469",
"netno" : "002",
"bk_tradetype" : "122",
"confirmbalance" : 10000.8,
"netvalue" : 1.0,
"cserialno" : "10110000119602",
"status" : "1"
}
]
}
Or, using $unwind, relax the length test so that "trade" only needs two entries (an element at index 1):
db.getCollection('collection').aggregate([
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
"trade.agencyno": "469"
}},
{ "$unwind": "$shares" },
{ "$match": {
"shares.lastshares": { "$gt": 1000 },
}},
{ "$group": {
"_id": "$_id",
"shares": { "$addToSet": "$shares.fundcode" },
"trade": { "$first": "$trade" }
}},
{ "$unwind": "$trade" },
{ "$match": {
"trade.agencyno": "469"
}},
{ "$group": {
"_id": "$_id",
"shares": { "$first": "$shares" },
"trade": { "$addToSet": "$trade.fundcode" }
}},
{ "$match": {
"shares.2": { "$exists": true },
"trade.1": { "$exists": true }
}}
])
Which returns:
{
"_id" : 1.0,
"shares" : [
"000003",
"000002",
"000001"
],
"trade" : [
"000003",
"000002"
]
}
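Of course, both forms identify the "document" that meets the conditions asked for in the original query, so the basic result is the same regardless of what is returned. If you must, you can always $project just the _id.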