Question

I have a large amount of data stored in a single document, which has roughly the following structure:

{
    "_id": "rPKzOqhVQfCwy2PzRqvyXA",
    "name": "test",
    "raw_data": [
        {},
        ...,
        {}
    ],
    "records": [
        {
            "_id": "xyz_1", // customer generated id
            ...other data
        },
        {
            "_id": "xyz_2", // customer generated id
            ...other data
        },
        {},
        {},
        ...
    ]
}

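I now need to store 1000 records from an imported file in this document, and each record will have its own ID (generated programmatically). The use case is that, once the file has been saved, the user only wants to run some processing on selected records (for example, those with the IDs xyz_1 and xyz_2).

A lot of other data may be stored in the same document, and in this use case I do not want to fetch all of it.

How can I query this document so that I get output like this: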

[
    {
        "_id": "xyz_1", // customer generated id
        ...other data
    },
    {
        "_id": "xyz_2", // customer generated id
        ...other data
    }
]

Answer 1

You need to run $unwind followed by $replaceRoot:

db.collection.aggregate([
    { $unwind: "$records" },
    { $replaceRoot: { newRoot: "$records" } }
])
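
To make the effect of these two stages concrete, here is a rough in-memory approximation in plain JavaScript. It is only a sketch: the sample data and the function name unwindAndReplaceRoot are hypothetical, and it mirrors the default $unwind behaviour of skipping documents whose array is missing or empty.

```javascript
// Hypothetical sample data mirroring the structure in the question.
const docs = [
  {
    _id: "parent_1",
    name: "test",
    records: [
      { _id: "xyz_1", value: 1 },
      { _id: "xyz_2", value: 2 },
      { _id: "xyz_3", value: 3 },
    ],
  },
];

// Rough equivalent of [{ $unwind: "$records" }, { $replaceRoot: { newRoot: "$records" } }]:
// emit one output document per array element, with that element as the new root.
function unwindAndReplaceRoot(documents) {
  return documents.flatMap((doc) =>
    Array.isArray(doc.records) ? doc.records : []
  );
}

const flattened = unwindAndReplaceRoot(docs);
console.log(flattened.map((record) => record._id));
```

Note that every record of every parent document comes out, so selecting xyz_1 and xyz_2 still needs a filter on top of this.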

Answer 2

Following @mickl's suggestion, this is my solution for getting the output above:

db.collection.aggregate([
    { $unwind: "$records" },
    { $replaceRoot: { newRoot: "$records" } },
    { $match: { "_id": { $in: ["xyz_1", "xyz_2"] } } },
])

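Update

I believe the solution above iterates over every document in the collection, replaces the root in each one, and only then runs the match.

I only want to search the records of one parent document, not those of every parent document in the collection. My concern was that the query should not touch the other parent documents, so I eventually arrived at the following solution: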

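db.collection.aggregate([
    // Select the single parent document first; parent_doc_id holds its _id.
    { "$match": { "_id": parent_doc_id } },
    { "$unwind": "$records" },
    // After $unwind, "records" is one embedded document, so match on records._id.
    { "$match": { "records._id": { "$in": ["xyz_1", "xyz_2"] } } },
    // Collect the matching records back into one array per parent document.
    { "$group": { "_id": "$_id", "records": { "$push": "$records" } } },
    { "$limit": 1 },
])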
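
As a possible alternative (not part of the answers above), the array can be filtered with $filter before it is unwound, so only the selected records are expanded and the result has the flat shape asked for in the question. This is a sketch under assumptions: the helper name buildSelectedRecordsPipeline is hypothetical, and the pipeline relies on the $in aggregation operator and $replaceRoot, which need MongoDB 3.4 or later.

```javascript
// Hypothetical helper: builds a pipeline that touches only one parent document
// and filters its "records" array with $filter.
function buildSelectedRecordsPipeline(parentDocId, recordIds) {
  return [
    // Restrict the pipeline to the single parent document first.
    { $match: { _id: parentDocId } },
    // Keep only the array elements whose _id is listed in recordIds.
    {
      $project: {
        _id: 0,
        records: {
          $filter: {
            input: "$records",
            as: "record",
            cond: { $in: ["$$record._id", recordIds] },
          },
        },
      },
    },
    // Flatten so that each selected record becomes its own output document.
    { $unwind: "$records" },
    { $replaceRoot: { newRoot: "$records" } },
  ];
}

const pipeline = buildSelectedRecordsPipeline("rPKzOqhVQfCwy2PzRqvyXA", [
  "xyz_1",
  "xyz_2",
]);
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell the result would be passed along as db.collection.aggregate(pipeline). Because the first stage matches on _id, no other parent document enters the later stages.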

How do I get specific items from an array stored in a nested document in MongoDB?
