Get counts and document details

Time: 2019-05-19 07:35:57

Tags: mongodb aggregation-framework

I have a collection like this:

[{
    "_id": "5ba6b67ab22f62939eba24cc",
    "voucher": "77-SRNP-4",
    "Collection Date": "1977-06-06T06:00:00.000Z",
    "Herbivore species": "Agrius cingulata",
    "Herbivore subfamily": "Sphinginae",
    "Latitude": "10.83764",
    "Longitude": "-85.61871"
}, {
    "_id": "5ba6b67ab22f62939eba24ea",
    "voucher": "78-SRNP-10",
    "Collection Date": "1978-05-20T06:00:00.000Z",
    "Herbivore species": "Xylophanes turbata",
    "Herbivore subfamily": "Macroglossinae",
    "Latitude": "10.80212",
    "Longitude": "-85.65372"
}, {
    "_id": "5ba6b67ab22f62939eba24eb",
    "voucher": "78-SRNP-10.02",
    "Collection Date": "1978-05-20T06:00:00.000Z",
    "Herbivore species": "Xylophanes turbata",
    "Herbivore subfamily": "Macroglossinae",
    "Latitude": "10.80212",
    "Longitude": "-85.65372"
}]

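I want to get the count per species along with some details of each record in a single query. Something like the opposite of $unwind. The goal is output similar to this: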

[{
    "Agrius cingulata": {
        count: 1,
        "Herbivore subfamily": "Sphinginae",
        records: [{
            "voucher": "77-SRNP-4",
            "Collection Date": "1977-06-06T06:00:00.000Z",
            "Latitude": "10.83764",
            "Longitude": "-85.61871"
        }]
    },
    "Xylophanes turbata": {
        count: 2,
        "Herbivore subfamily": "Macroglossinae",
        records: [
            {
                "voucher": "78-SRNP-10",
                "Collection Date": "1978-05-20T06:00:00.000Z",
                "Latitude": "10.80212",
                "Longitude": "-85.65372"
            },
            {
                "voucher": "78-SRNP-10.02",
                "Collection Date": "1978-05-20T06:00:00.000Z",
                "Latitude": "10.80212",
                "Longitude": "-85.65372"
            }
        ]
    }
}]

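I am currently handling this with two separate queries, one to find the records and another to get the counts. But the payload is a bit large, and I think I could reduce it if repeated information (such as the species subfamily) were sent only once and the counts and other statistics were bundled together. I have not found the right aggregation approach, though.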

Thanks!

2 answers:

Answer 0 (score: 0)

Try this:

db.collection.aggregate([
    {
        $group: {
            _id: "$Herbivore species",
            records: { $push: { Longitude: "$Longitude", Latitude: "$Latitude", "Collection Date": "$Collection Date", voucher: "$voucher" } },
            count : { $sum :1}
        }
    }
])

Result:

{
    "_id" : "Xylophanes turbata",
    "records" : [
        {
            "Longitude" : "-85.65372",
            "Latitude" : "10.80212",
            "Collection Date" : "1978-05-20T06:00:00.000Z",
            "voucher" : "78-SRNP-10"
        },
        {
            "Longitude" : "-85.65372",
            "Latitude" : "10.80212",
            "Collection Date" : "1978-05-20T06:00:00.000Z",
            "voucher" : "78-SRNP-10.02"
        }
    ],
    "count" : 2
}
{
    "_id" : "Agrius cingulata",
    "records" : [
        {
            "Longitude" : "-85.61871",
            "Latitude" : "10.83764",
            "Collection Date" : "1977-06-06T06:00:00.000Z",
            "voucher" : "77-SRNP-4"
        }
    ],
    "count" : 1
}

To group by both "Herbivore species" and "Herbivore subfamily", you can try something like this:

db.collection.aggregate([
    {
        $group: {
            _id: { "Herbivore species" :"$Herbivore species" , "Herbivore subfamily": "$Herbivore subfamily" },
            records: { $push: { Longitude: "$Longitude", Latitude: "$Latitude", "Collection Date": "$Collection Date", voucher: "$voucher" } },
            count : { $sum :1}
        }
    }
])

Which gives:

{
    "_id" : {
        "Herbivore species" : "Xylophanes turbata",
        "Herbivore subfamily" : "Macroglossinae"
    },
    "records" : [
        {
            "Longitude" : "-85.65372",
            "Latitude" : "10.80212",
            "Collection Date" : "1978-05-20T06:00:00.000Z",
            "voucher" : "78-SRNP-10"
        },
        {
            "Longitude" : "-85.65372",
            "Latitude" : "10.80212",
            "Collection Date" : "1978-05-20T06:00:00.000Z",
            "voucher" : "78-SRNP-10.02"
        }
    ],
    "count" : 2
}
{
    "_id" : {
        "Herbivore species" : "Agrius cingulata",
        "Herbivore subfamily" : "Sphinginae"
    },
    "records" : [
        {
            "Longitude" : "-85.61871",
            "Latitude" : "10.83764",
            "Collection Date" : "1977-06-06T06:00:00.000Z",
            "voucher" : "77-SRNP-4"
        }
    ],
    "count" : 1
}
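
Grouping on a compound key leaves the subfamily nested inside _id. If a flat shape is preferred, one option is to flatten each group in client code after the results come back. The sketch below is plain JavaScript; the helper name flattenGroup and the trimmed sample array are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: lift the fields of a compound _id up to the top level.
function flattenGroup({ _id, ...rest }) {
  return { ..._id, ...rest };
}

// Sample shaped like the output of the compound-key $group above (records trimmed).
const groups = [
  {
    _id: { "Herbivore species": "Xylophanes turbata", "Herbivore subfamily": "Macroglossinae" },
    records: [{ voucher: "78-SRNP-10" }, { voucher: "78-SRNP-10.02" }],
    count: 2
  },
  {
    _id: { "Herbivore species": "Agrius cingulata", "Herbivore subfamily": "Sphinginae" },
    records: [{ voucher: "77-SRNP-4" }],
    count: 1
  }
];

const flat = groups.map(flattenGroup);
console.log(JSON.stringify(flat, null, 2));
```

Each element of flat then carries "Herbivore species" and "Herbivore subfamily" as top-level fields next to records and count.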

Answer 1 (score: 0)

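The basic concept of the "reverse of $unwind" is of course $push. So that is basically what you do, additionally using $first where appropriate, along with $arrayToObject, $objectToArray and $filter. Those last three save you from having to name every field in the document, which matters especially when the real documents have more fields than the question shows.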

The following is fairly generic and does not care how many other fields the documents contain:

db.collection.aggregate([
  { "$group": {
    "_id": "$Herbivore species",
    "count": { "$sum": 1 },
    "Herbivore subfamily": { "$first": "$Herbivore subfamily" },
    "records": {
      "$push": {
        "$arrayToObject": {
          "$filter": {
            "input": { "$objectToArray": "$$ROOT" },
            "cond": { "$not": { "$in": [ "$$this.k", ["Herbivore subfamily", "Herbivore species"] ] } }
          }
        }
      }
    }
  }}
])

This produces results like:

{
        "_id" : "Agrius cingulata",
        "count" : 1,
        "Herbivore subfamily" : "Sphinginae",
        "records" : [
                {
                        "_id" : "5ba6b67ab22f62939eba24cc",
                        "voucher" : "77-SRNP-4",
                        "Collection Date" : "1977-06-06T06:00:00.000Z",
                        "Latitude" : "10.83764",
                        "Longitude" : "-85.61871"
                }
        ]
}
{
        "_id" : "Xylophanes turbata",
        "count" : 2,
        "Herbivore subfamily" : "Macroglossinae",
        "records" : [
                {
                        "_id" : "5ba6b67ab22f62939eba24ea",
                        "voucher" : "78-SRNP-10",
                        "Collection Date" : "1978-05-20T06:00:00.000Z",
                        "Latitude" : "10.80212",
                        "Longitude" : "-85.65372"
                },
                {
                        "_id" : "5ba6b67ab22f62939eba24eb",
                        "voucher" : "78-SRNP-10.02",
                        "Collection Date" : "1978-05-20T06:00:00.000Z",
                        "Latitude" : "10.80212",
                        "Longitude" : "-85.65372"
                }
        ]
}
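
For readers less familiar with these operators, the $objectToArray, $filter, $arrayToObject chain inside $push can be pictured with a rough plain-JavaScript analogue. This is only a sketch of the idea, not how the server implements it; the helper name omitKeys is an assumption for illustration.

```javascript
// Hypothetical helper mirroring the $objectToArray -> $filter -> $arrayToObject chain.
function omitKeys(doc, excluded) {
  // Object.entries plays the role of $objectToArray: a list of [key, value] pairs.
  const pairs = Object.entries(doc);
  // Keep only pairs whose key is not excluded, like the $filter "cond".
  const kept = pairs.filter(([k]) => !excluded.includes(k));
  // Object.fromEntries plays the role of $arrayToObject.
  return Object.fromEntries(kept);
}

// First document from the question.
const sample = {
  "_id": "5ba6b67ab22f62939eba24cc",
  "voucher": "77-SRNP-4",
  "Collection Date": "1977-06-06T06:00:00.000Z",
  "Herbivore species": "Agrius cingulata",
  "Herbivore subfamily": "Sphinginae",
  "Latitude": "10.83764",
  "Longitude": "-85.61871"
};

const record = omitKeys(sample, ["Herbivore subfamily", "Herbivore species"]);
console.log(record);
```

The two grouping fields are dropped and every other field, including _id, is kept, which is why _id appears inside each entry of records in the output above.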

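That is not exactly what the question asks for, since the result is not keyed by species name in the requested way. But it can be reshaped with a second $group stage using the same operators shown before: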

db.collection.aggregate([
  { "$group": {
    "_id": "$Herbivore species",
    "count": { "$sum": 1 },
    "Herbivore subfamily": { "$first": "$Herbivore subfamily" },
    "records": {
      "$push": {
        "$arrayToObject": {
          "$filter": {
            "input": { "$objectToArray": "$$ROOT" },
            "cond": { "$not": { "$in": [ "$$this.k", ["Herbivore subfamily", "Herbivore species"] ] } }
          }
        }
      }
    }
  }},
  { "$group": {
    "_id": null,
    "content": {
      "$mergeObjects": {
        "$arrayToObject": [[
          { "k": "$_id",
            "v": {
              "$arrayToObject": {
                "$filter": {
                  "input": { "$objectToArray": "$$ROOT" },
                  "cond": { "$ne": ["$$this.k", "_id"] }
                }
              }
            }
          }
        ]]
      }
    }
  }},
  { "$replaceRoot": { "newRoot": "$content" } }
])

Which returns:

{
        "Xylophanes turbata" : {
                "count" : 2,
                "Herbivore subfamily" : "Macroglossinae",
                "records" : [
                        {
                                "_id" : "5ba6b67ab22f62939eba24ea",
                                "voucher" : "78-SRNP-10",
                                "Collection Date" : "1978-05-20T06:00:00.000Z",
                                "Latitude" : "10.80212",
                                "Longitude" : "-85.65372"
                        },
                        {
                                "_id" : "5ba6b67ab22f62939eba24eb",
                                "voucher" : "78-SRNP-10.02",
                                "Collection Date" : "1978-05-20T06:00:00.000Z",
                                "Latitude" : "10.80212",
                                "Longitude" : "-85.65372"
                        }
                ]
        },
        "Agrius cingulata" : {
                "count" : 1,
                "Herbivore subfamily" : "Sphinginae",
                "records" : [
                        {
                                "_id" : "5ba6b67ab22f62939eba24cc",
                                "voucher" : "77-SRNP-4",
                                "Collection Date" : "1977-06-06T06:00:00.000Z",
                                "Latitude" : "10.83764",
                                "Longitude" : "-85.61871"
                        }
                ]
        }
}

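Or, if you prefer (since it does not change the amount of data returned anyway), you can simply "reduce" the returned documents to the "key/value" form in client code after the results come back from MongoDB. A simple JavaScript example for the shell: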

db.collection.aggregate([
  { "$group": {
    "_id": "$Herbivore species",
    "count": { "$sum": 1 },
    "Herbivore subfamily": { "$first": "$Herbivore subfamily" },
    "records": {
      "$push": {
        "$arrayToObject": {
          "$filter": {
            "input": { "$objectToArray": "$$ROOT" },
            "cond": { "$not": { "$in": [ "$$this.k", ["Herbivore subfamily", "Herbivore species"] ] } }
          }
        }
      }
    }
  }},
  /*
  { "$replaceRoot": {
    "newRoot": {
      "$arrayToObject": [[
        { "k": "$_id",
          "v": {
            "$arrayToObject": {
              "$filter": {
                "input": { "$objectToArray": "$$ROOT" },
                "cond": { "$ne": ["$$this.k", "_id"] }
              }
            }
          }
        }
      ]]
    }
  }}
  */
]).toArray().reduce((o,{ _id, ...rest }) => ({ ...o, [_id]: rest }),{})

With the same result:

{
        "Xylophanes turbata" : {
                "count" : 2,
                "Herbivore subfamily" : "Macroglossinae",
                "records" : [
                        {
                                "_id" : "5ba6b67ab22f62939eba24ea",
                                "voucher" : "78-SRNP-10",
                                "Collection Date" : "1978-05-20T06:00:00.000Z",
                                "Latitude" : "10.80212",
                                "Longitude" : "-85.65372"
                        },
                        {
                                "_id" : "5ba6b67ab22f62939eba24eb",
                                "voucher" : "78-SRNP-10.02",
                                "Collection Date" : "1978-05-20T06:00:00.000Z",
                                "Latitude" : "10.80212",
                                "Longitude" : "-85.65372"
                        }
                ]
        },
        "Agrius cingulata" : {
                "count" : 1,
                "Herbivore subfamily" : "Sphinginae",
                "records" : [
                        {
                                "_id" : "5ba6b67ab22f62939eba24cc",
                                "voucher" : "77-SRNP-4",
                                "Collection Date" : "1977-06-06T06:00:00.000Z",
                                "Latitude" : "10.83764",
                                "Longitude" : "-85.61871"
                        }
                ]
        }
}

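The main thing to grasp is that the "aggregation" itself (which is what the first stage shown does) is what you actually want the database server to do. You can use the "fancy" operators or not, depending on your available MongoDB version ($objectToArray and $arrayToObject need MongoDB 3.4.4 or later, and $mergeObjects needs 3.6). The "final result transformation", demonstrated as the second stage in the examples above, is probably something you would rather do in the receiving client code. It is generally more straightforward and simpler to do it there.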
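
As a sketch of that client-side step, the function below reshapes grouped documents into the keyed form with a plain loop. This avoids copying the accumulator on every iteration, which the spread-based reduce does. The name keyBySpecies and the trimmed sample array are assumptions for illustration; in practice the array would come from the aggregation cursor.

```javascript
// Hypothetical helper: turn [{ _id, ...fields }] into { [_id]: { ...fields } }.
function keyBySpecies(docs) {
  const out = {};
  for (const { _id, ...rest } of docs) {
    out[_id] = rest; // the species name becomes the key, _id itself is dropped
  }
  return out;
}

// Sample shaped like the output of the single $group stage (records trimmed).
const grouped = [
  {
    _id: "Agrius cingulata",
    count: 1,
    "Herbivore subfamily": "Sphinginae",
    records: [{ voucher: "77-SRNP-4" }]
  },
  {
    _id: "Xylophanes turbata",
    count: 2,
    "Herbivore subfamily": "Macroglossinae",
    records: [{ voucher: "78-SRNP-10" }, { voucher: "78-SRNP-10.02" }]
  }
];

const keyed = keyBySpecies(grouped);
console.log(JSON.stringify(keyed, null, 2));
```

If two documents shared the same _id, the later one would overwrite the earlier one; that cannot happen here because $group emits one document per distinct _id.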
