Group by a field of documents in an embedded array, then by a field of the parent document

Asked: 2018-02-01 16:01:36

Tags: mongodb aggregation-framework mongodb-3.6

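I'm not sure how to phrase this, but basically I want to group documents by a field in a sub-array, and then group by a field in the parent (root) document while preserving the previous grouping.

Hopefully an example will help here.

Say I have these documents, in which the information for several custItemNum values is already more or less grouped by originalFile: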

[
    {
        "items" : [ 
            {
                "recType" : "I2",
                "qty" : 2.0,
                "custItemNum" : 10.0
            }, 
            {
                "recType" : "I2",
                "qty" : 200.0,
                "custItemNum" : 20.0
            }, 
            {
                "recType" : "I2",
                "qty" : 50.0,
                "custItemNum" : 30.0
            }, 
            {
                "recType" : "D9",
                "custItemNum" : 10.0
            }, 
            {
                "recType" : "D9",
                "custItemNum" : 20.0
            }, 
            {
                "recType" : "D9",
                "custItemNum" : 30.0
            }
        ],
        "originalFile" : "727451921.txt",
        "docId" : "278791399"
    },
    {
        "items" : [ 
            {
                "recType" : "I2",
                "qty" : 180.0,
                "custItemNum" : 20.0
            }
        ],
        "originalFile" : "727557371.txt",
        "docId" : "278791399"
    },
    {
        "items" : [ 
            {
                "recType" : "I2",
                "qty" : 10.0,
                "custItemNum" : 30.0
            }
        ],
        "originalFile" : "727557371.txt",
        "docId" : "278791399"
    },
    {
        "items" : [ 
            {
                "recType" : "I2",
                "qty" : 10.0,
                "custItemNum" : 30.0
            }
        ],
        "originalFile" : "727557371.txt",
        "docId" : "278791399"
    }
]

I'd like to end up with a collection like this, where the first grouping is by custItemNumber and the second is by originalFile:

[
    {
        "custItemNumber" : 10.0,
        "count" : 2.0,
        "itemInfo" : [ 
            {
                "originalFile" : "727451921.txt",
                "item" : [ 
                    {
                        "recType" : "I2",
                        "qty" : 2.0,
                        "custItemNum" : 10.0
                    }, 
                    {
                        "recType" : "D9",
                        "custItemNum" : 10.0
                    }
                ]
            }
        ]
    },
    {
        "custItemNumber" : 20.0,
        "count" : 3.0,
        "itemInfo" : [ 
            {
                "originalFile" : "727451921.txt",
                "item" : [ 
                    {
                        "recType" : "I2",
                        "qty" : 200.0,
                        "custItemNum" : 20.0
                    }, 
                    {
                        "recType" : "D9",
                        "custItemNum" : 20.0
                    }
                ]
            }, 
            {
                "originalFile" : "727557371.txt",
                "item" : [ 
                    {
                        "recType" : "I2",
                        "qty" : 180.0,
                        "custItemNum" : 20.0
                    }
                ]
            }
        ]
    },
    {
        "custItemNumber" : 30.0,
        "count" : 4.0,
        "itemInfo" : [ 
            {
                "originalFile" : "727451921.txt",
                "item" : [ 
                    {
                        "recType" : "I2",
                        "qty" : 50.0,
                        "custItemNum" : 30.0
                    }, 
                    {
                        "recType" : "D9",
                        "custItemNum" : 30.0
                    }
                ]
            }, 
            {
                "originalFile" : "727557371.txt",
                "item" : [ 
                    {
                        "recType" : "I2",
                        "qty" : 10.0,
                        "custItemNum" : 30.0
                    }, 
                    {
                        "recType" : "I2",
                        "qty" : 10.0,
                        "custItemNum" : 30.0
                    }
                ]
            }
        ]
    }
]

Keep in mind that these documents already come out of several aggregation stages, so there is no usable _id field.

So far I have come up with these aggregation stages (I manually edited their output to get the result above):

{$unwind: "$items"},
{$bucket: {
    groupBy: "$items.custItemNum",
    boundaries: [10, 20, 30, 40, 50, 60, 70, 80, 90, 100],
    output: {
        count: {$sum: 1},
        itemInfo: {$push: "$$ROOT"}
    }
 }}

Which produces this result:

[
    {
        "_id" : 10.0,
        "count" : 2.0,
        "itemInfo" : [ 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae528f"),
                "items" : {
                    "recType" : "I2",
                    "qty" : 2.0,
                    "custItemNum" : 10.0
                },
                "originalFile" : "727451921.txt",
                "docId" : "278791399"
            }, 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae528f"),
                "items" : {
                    "recType" : "D9",
                    "custItemNum" : 10.0
                },
                "originalFile" : "727451921.txt",
                "docId" : "278791399"
            }
        ]
    },
    {
        "_id" : 20.0,
        "count" : 3.0,
        "itemInfo" : [ 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae528f"),
                "items" : {
                    "recType" : "I2",
                    "qty" : 200.0,
                    "custItemNum" : 20.0
                },
                "originalFile" : "727451921.txt",
                "docId" : "278791399"
            }, 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae528f"),
                "items" : {
                    "recType" : "D9",
                    "custItemNum" : 20.0
                },
                "originalFile" : "727451921.txt",
                "docId" : "278791399"
            }, 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae5290"),
                "items" : {
                    "recType" : "I2",
                    "qty" : 180.0,
                    "custItemNum" : 20.0
                },
                "originalFile" : "727557371.txt",
                "docId" : "278791399"
            }
        ]
    },
    {
        "_id" : 30.0,
        "count" : 4.0,
        "itemInfo" : [ 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae528f"),
                "items" : {
                    "recType" : "I2",
                    "qty" : 50.0,
                    "custItemNum" : 30.0
                },
                "originalFile" : "727451921.txt",
                "docId" : "278791399"
            }, 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae528f"),
                "items" : {
                    "recType" : "D9",
                    "custItemNum" : 30.0
                },
                "originalFile" : "727451921.txt",
                "docId" : "278791399"
            }, 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae5291"),
                "items" : {
                    "recType" : "I2",
                    "qty" : 10.0,
                    "custItemNum" : 30.0
                },
                "originalFile" : "727557371.txt",
                "docId" : "278791399"
            }, 
            {
                "_id" : ObjectId("5a7336ebb4b169272dae5292"),
                "items" : {
                    "recType" : "I2",
                    "qty" : 10.0,
                    "custItemNum" : 30.0
                },
                "originalFile" : "727557371.txt",
                "docId" : "278791399"
            }
        ]
    }
]

I'm stuck here; any other step I can think of (e.g. $replaceRoot : { newRoot: "$itemInfo" }) destroys the outer grouping.

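Also, the custItemNum values are dynamic, but AFAICT the boundaries field of the $bucket stage takes a constant array, so if there is a way to pass a computed array there, I'd like to know how.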
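
For reference, one direction that might work is a rough sketch that drops $bucket in favour of two consecutive $group stages: the first groups on the pair (custItemNum, originalFile), the second regroups on custItemNum alone and pushes the per-file groups into itemInfo. Since $group derives its keys from the data, no boundaries array would be needed. The variable name pipeline, the helper field n, and the collection name in the comment are assumptions for illustration, and this has not been run against the real data.

```javascript
// Sketch only: stages meant to be appended after the existing aggregation steps.
const pipeline = [
    // One document per embedded item.
    { $unwind: "$items" },
    // Inner grouping: one group per (custItemNum, originalFile) pair.
    { $group: {
        _id: { custItemNum: "$items.custItemNum", originalFile: "$originalFile" },
        item: { $push: "$items" },
        n: { $sum: 1 }              // "n" is a hypothetical helper field
    } },
    // Intended to give a predictable order inside itemInfo.
    { $sort: { "_id.originalFile": 1 } },
    // Outer grouping: regroup by custItemNum, keeping the inner groups as sub-documents.
    { $group: {
        _id: "$_id.custItemNum",
        count: { $sum: "$n" },
        itemInfo: { $push: { originalFile: "$_id.originalFile", item: "$item" } }
    } },
    // Rename _id to custItemNumber to match the desired output shape.
    { $project: { _id: 0, custItemNumber: "$_id", count: 1, itemInfo: 1 } },
    { $sort: { custItemNumber: 1 } }
];

// Hypothetical usage in the mongo shell:
// db.someCollection.aggregate(pipeline)
```

The intermediate $sort is there because $group does not guarantee any output order; whether $push then follows that order in every case is something to verify on the actual deployment.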
