假设我有这两个巨大的文件:
[
{
_id: ....,
status: "A",
class: "DIP1A",
"created.user._id": ...,
"created.dt": ....,
"category": "private",
price: 100.00 //type double
},
{
_id: ....,
status: "A",
class: "DIP2A",
"created.user._id": ...
"created.dt": ...,
"category": "public",
price: 200.00 //type double
},
];
查询:
var pipeline = [
{
$match: {
"created.user._id": ....
}
},
{
$unwind: "$class"
},
{
$unwind: "$price"
},
{
$group: {
_id: "$class",
price: {
$sum: "$price"
},
count: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
class: '$_id',
count: 1,
price: 1
}
}
];
db.myCollection.aggregate(pipeline);
问题:
索引:
db.myCollection.ensureIndex({ 'created.user._id': -1 });
db.myCollection.ensureIndex({ 'created.user._id': -1, class: 1 });
db.myCollection.ensureIndex({ 'created.user._id': -1, price: 1});
性能:
答案 0 :(得分:0)
你真正应该做的一件事是将$ project阶段移到$ match阶段之后(如果文档包含更多数据,然后在你的问题中说明(大文档))。 您希望通过管道尽可能少的数据。 此外,我看到价格和课程的价格放松,但在你的例子中,他们不是阵列。它可能是复制/粘贴问题; - )
喜欢:
var pipeline = [
{
$match: {
"created.user._id": ....
}
},
{
$project: {
_id: 0,
class: '$_id',
count: 1,
price: 1
}
},
{
$unwind: "$class"
},
{
$unwind: "$price"
},
{
$group: {
_id: "$class",
price: {
$sum: "$price"
},
count: {
$sum: 1
}
}
},
];