$group after $lookup takes too long

Date: 2018-05-15 15:13:20

Tags: mongodb aggregation-framework

I have the following mongo collection:

{
    "_id" : "22pTvYLd7azAAPL5T",
    "plate" : "ABC-123",
    "company": "AMZ",
    "_portfolioType" : "account"
},
{
    "_id" : "22pTvYLd7azAAPL5T",
    "plate" : "ABC-123",
    "_portfolioType" : "sale",
    "price": 87.3
},
{
    "_id" : "22pTvYLd7azAAPL5T",
    "plate" : "ABC-123",
    "_portfolioType" : "sale",
    "price": 88.9
}

I am trying to aggregate all documents that have the same value in the plate field. Here is the query I have written so far:

db.getCollection('temp').aggregate([
{
    $lookup: { 
        from: 'temp',
        let: { 'p': '$plate', 't': '$_portfolioType' },
        pipeline: [{
            '$match': {
                '_portfolioType': 'sale',
                '$expr': { '$and': [ 
                    { '$eq': [ '$plate', '$$p'  ] },
                    { '$eq': [ '$$t', 'account'  ] }
                ]}
            }
        }],
        as: 'revenues' 
    },
},
{
    $project: {
        plate: 1,
        company: 1,
        totalTrades: { $arrayElemAt: ['$revenues', 0] },
    },
},

{
    $addFields: {
        revenue: { $add: [{ $multiply: ['$totalTrades.price', 100] }, 99] },
    },
},

{
    $group: {
        _id: '$company',
        revenue: { $sum: '$revenue' },
    }
}
])

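If I remove the $group stage, the query works fine. As soon as I add the $group stage, however, mongo starts processing indefinitely. I tried adding a $match as the first stage to limit the number of documents to process, but without any luck. E.g.: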

{
    $match: { $or: [{ _portfolioType: 'account' }, { _portfolioType: 'sale' }] }
},

I also tried using { explain: true }, but it did not return any useful information.

1 Answer:

Answer 0: (score: 1)

As Neil Lunn noticed, you most likely don't need a lookup to reach your "end goal", which is still quite vague.

Please read the comments and adjust as needed:

db.temp.aggregate([
    {$group:{
        // Get unique plates
        _id: "$plate",
        // Not clear what you expect if there are documents with
        // different company, and the same plate.
        // Assuming "it never happens"
        // You may need to $cond it here with {$eq: ["$_portfolioType", "account"]}
        // but you never voiced it.         
        company: {$first:"$company"},
        // Not exactly all documents with _portfolioType: sale,
        // but rather price from all documents for this plate.
        // Assuming price field is available only in documents 
        // with "_portfolioType" : "sale". Otherwise add a $cond here.
        // If you really need "all documents", push $$ROOT instead.
        prices: {$push: "$price"}        
    }},
    {$project: {
       company: 1,
       // Apply your math here, or on the previous stage
       // to calculate revenue per plate
       revenue: "$prices" 
    }},
    {$group: {
        // Get document for each "company" 
        _id: "$company",
        // Revenue associated with plate
        revenuePerPlate: {$push: {"k":"$_id", "v":"$revenue"}}        
    }},
    {$project:{         
        _id: 0,
        company: "$_id",
        // Count of unique plate
        platesCnt: {$size: "$revenuePerPlate"},
        // arrayToObject if you wish plate names as properties
        revenuePerPlate: {$arrayToObject: "$revenuePerPlate"}
    }}
])
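
The "apply your math here" comment in the first $project stage could be filled in roughly as follows. This is only a sketch under two assumptions that the question does not confirm: revenue per sale is price * 100 + 99 (the formula from the question's $addFields stage), and every sale of a plate counts rather than only the first one. The names revenueStage and revenueFromPrices are made up for illustration.

```javascript
// Sketch of the first $project stage with the revenue math filled in.
// Assumption: revenue per sale is price * 100 + 99, summed over all sales of a plate.
const revenueStage = {
    $project: {
        company: 1,
        revenue: {
            // $sum over an array-valued expression adds up its numeric elements
            $sum: {
                $map: {
                    input: "$prices",
                    as: "price",
                    in: { $add: [{ $multiply: ["$$price", 100] }, 99] }
                }
            }
        }
    }
};

// Hypothetical plain-JavaScript mirror of the expression above, useful for
// checking the arithmetic on a few prices without a running server.
function revenueFromPrices(prices) {
    return prices.reduce((total, price) => total + (price * 100 + 99), 0);
}
```

In the shell, revenueStage would take the place of the first $project stage in the pipeline above. With a numeric revenue per plate, the last $group could also use $sum instead of $push if only a per-company total is needed.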