Question

I have three MongoDB documents:

{ 
    "_id" : ObjectId("571094afc2bcfe430ddd0815"), 
    "name" : "Barry", 
    "surname" : "Allen", 
    "address" : [
        {
            "street" : "Red", 
            "number" : NumberInt(66), 
            "city" : "Central City"
        }, 
        {
            "street" : "Yellow", 
            "number" : NumberInt(7), 
            "city" : "Gotham City"
        }
    ]
}

{ 
    "_id" : ObjectId("57109504c2bcfe430ddd0816"), 
    "name" : "Oliver", 
    "surname" : "Queen", 
    "address" : {
        "street" : "Green", 
        "number" : NumberInt(66), 
        "city" : "Star City"
    }
}
{ 
    "_id" : ObjectId("5710953ac2bcfe430ddd0817"), 
    "name" : "Tudof", 
    "surname" : "Unknown", 
    "address" : "homeless"
}

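The address field is an Array of objects in the first document, an Object in the second, and a String in the third. My goal is to find how many documents in my collection contain the field address.street. In this case the correct count is 1, but my query returns 2: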

db.coll.find({"address.street":{"$exists":1}}).count()

I also tried map/reduce. It works, but it is slower, so I would like to avoid it if possible.

Answer 1

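The difference here is that the .count() operation is actually "correct" in returning the number of "documents" in which the field is present. So the general considerations break down to:

If you just want to exclude the documents with the array field

Then the most efficient way to exclude those documents where "street" is a property of an "address" that is an "array" is to add an exclusion condition that uses dot notation to require that the 0 index of "address" does not exist: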

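db.coll.find({
  // "street" must be reachable under "address"
  "address.street": { "$exists": true },
  // and "address" must have no element at index 0, which rules out arrays
  "address.0": { "$exists": false }
}).count()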

$exists is a natively coded operator test, so in both conditions it does the job correctly and efficiently.
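To see why the extra "address.0" condition changes the count, here is a minimal plain-JavaScript sketch that mimics the two $exists tests on the three sample documents. The helpers `hasStreet` and `hasIndexZero` are hypothetical names for illustration, not MongoDB APIs, and they only approximate the server's dot-notation matching for this particular data shape.

```javascript
// Sample documents from the question (_id fields omitted for brevity).
const docs = [
  { name: "Barry", address: [
      { street: "Red", number: 66, city: "Central City" },
      { street: "Yellow", number: 7, city: "Gotham City" } ] },
  { name: "Oliver", address: { street: "Green", number: 66, city: "Star City" } },
  { name: "Tudof", address: "homeless" }
];

const isObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);

// Approximates { "address.street": { "$exists": true } }:
// dot notation also looks inside the elements of an array.
function hasStreet(doc) {
  const a = doc.address;
  if (Array.isArray(a)) return a.some((el) => isObject(el) && "street" in el);
  return isObject(a) && "street" in a;
}

// Approximates { "address.0": { "$exists": true } }:
// true for an array that has a first element. (An embedded document with a
// key literally named "0" would also match on the server; ignored here.)
function hasIndexZero(doc) {
  return Array.isArray(doc.address) && doc.address.length > 0;
}

const naiveCount = docs.filter(hasStreet).length;
const objectOnlyCount = docs.filter((d) => hasStreet(d) && !hasIndexZero(d)).length;

console.log(naiveCount, objectOnlyCount); // prints: 2 1
```

The first number corresponds to the query in the question, the second to the query with the array exclusion.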

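If you intended to count field occurrences

If what you are actually asking for is the number of times the field occurs, note that some "documents" contain array entries, so the "field" may be present several times in a single document.

For that you need the aggregation framework or mapReduce, as you mention. MapReduce uses JavaScript-based processing and is therefore considerably slower than the .count() operation. The aggregation framework also needs to calculate and "will" be slower than .count(), but not by as much as mapReduce.

In MongoDB 3.2 you get some help here from the expanded ability of $sum to work on an array of values as well as being a grouping accumulator. The other helper is $isArray, which allows a different processing method via $map when the data is in fact an "array":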

db.coll.aggregate([
  { "$group": {
    "_id": null,
    "count": {
      "$sum": {
        "$sum": {
          "$cond": {
            "if": { "$isArray": "$address" },
            "then": {
              "$map": {
                "input": "$address",
                "as": "el",
                "in": {
                  "$cond": {
                    "if": { "$ifNull": [ "$$el.street", false ] },
                    "then": 1,
                    "else": 0
                  }
                }
              }
            },
            "else": {
              "$cond": {
                "if": { "$ifNull": [ "$address.street", false ] },
                "then": 1,
                "else": 0
              }
            }
          }
        }
      }
    }
  }}
])
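As a rough cross-check of what the pipeline above computes, the same branching can be sketched in plain JavaScript over the sample documents. This is only an illustration: `streetOccurrences` and `has` are hypothetical helpers, and `!= null` is an approximation of the `$ifNull`/`$cond` test (it does not reproduce how the server treats values such as 0 or false).

```javascript
// Sample documents from the question (_id fields omitted for brevity).
const docs = [
  { name: "Barry", address: [
      { street: "Red", number: 66, city: "Central City" },
      { street: "Yellow", number: 7, city: "Gotham City" } ] },
  { name: "Oliver", address: { street: "Green", number: 66, city: "Star City" } },
  { name: "Tudof", address: "homeless" }
];

// 1 when the value is an object carrying a non-null "street", otherwise 0.
const has = (v) => (v !== null && typeof v === "object" && v.street != null ? 1 : 0);

// Mirrors the branch on $isArray: map every array element to 1/0 and sum,
// or test the single value directly when "address" is not an array.
function streetOccurrences(doc) {
  const a = doc.address;
  return Array.isArray(a) ? a.reduce((sum, el) => sum + has(el), 0) : has(a);
}

// Mirrors the outer $sum accumulator with "_id": null (one group for everything).
const total = docs.reduce((sum, d) => sum + streetOccurrences(d), 0);

console.log(total); // prints: 3
```

Barry contributes 2, Oliver 1, and the string address contributes 0.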

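Earlier versions hinge on a bit more conditional processing in order to treat the array and non-array data differently, and generally require $unwind to process array entries.

Either by converting the non-array values to arrays via $map with MongoDB 2.6: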

db.coll.aggregate([
  { "$project": {
    "address": {
      "$cond": {
        "if": { "$ifNull": [ "$address.0", false ] },
        "then": "$address",
        "else": {
          "$map": {
            "input": ["A"],
            "as": "el",
            "in": "$address"
          }
        }
      }
    }
  }},
  { "$unwind": "$address" },
  { "$group": {
    "_id": null,
    "count": {
      "$sum": {
        "$cond": {
          "if": { "$ifNull": [ "$address.street", false ] },
          "then": 1,
          "else": 0
        }
      }
    }
  }}
])
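The three stages of this pipeline (normalize to an array, unwind, then sum) can be pictured with a small plain-JavaScript sketch over the sample documents. It is an illustration of the idea only, under the assumption that every non-array address is wrapped in a one-element array; the variable names are made up for the example.

```javascript
// Sample documents from the question (_id fields omitted for brevity).
const docs = [
  { name: "Barry", address: [
      { street: "Red", number: 66, city: "Central City" },
      { street: "Yellow", number: 7, city: "Gotham City" } ] },
  { name: "Oliver", address: { street: "Green", number: 66, city: "Star City" } },
  { name: "Tudof", address: "homeless" }
];

// Step 1 (like $project): keep arrays, wrap anything else in a one-element array.
const normalized = docs.map((d) => ({
  ...d,
  address: Array.isArray(d.address) ? d.address : [d.address]
}));

// Step 2 (like $unwind): one record per array element.
const unwound = normalized.flatMap((d) =>
  d.address.map((el) => ({ ...d, address: el }))
);

// Step 3 (like $group with $sum): count records whose address has a street.
const count = unwound.filter(
  (r) => r.address !== null && typeof r.address === "object" && r.address.street != null
).length;

console.log(unwound.length, count); // prints: 4 3
```

Once every document has an array, the counting step no longer needs to care what shape the field originally had.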

Or by providing conditional selection with MongoDB 2.2 or 2.4:

db.coll.aggregate([
  { "$group": {
    "_id": "$_id",
    "address": { 
      "$first": {
        "$cond": [
          { "$ifNull": [ "$address.0", false ] },
          "$address",
          { "$const": [null] }
        ]
      }
    },
    "other": {
      "$push": {
        "$cond": [
          { "$ifNull": [ "$address.0", false ] },
          null,
          "$address"
        ]
      }
    },
    "has": { 
      "$first": {
        "$cond": [
          { "$ifNull": [ "$address.0", false ] },
          1,
          0
        ]
      }
    }
  }},
  { "$unwind": "$address" },
  { "$unwind": "$other" },
  { "$group": {
    "_id": null,
    "count": {
      "$sum": {
        "$cond": [
          { "$eq": [ "$has", 1 ] },
          { "$cond": [
            { "$ifNull": [ "$address.street", false ] },
            1,
            0
          ]},
          { "$cond": [
            { "$ifNull": [ "$other.street", false ] },
            1,
            0
          ]}
        ]
      }
    }
  }}
])

So the latter form "should" perform a bit better than mapReduce, but probably not by much.

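In all cases the logic comes down to using $ifNull as the "logical" form of $exists for the aggregation framework. Paired with $cond, it yields a "truthful" result when the property actually exists and a false value when it does not. This determines whether 1 or 0, respectively, is passed to the overall accumulation via $sum.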

Ideally you have a modern version that can do this in a single $group pipeline stage; otherwise you need the longer path.
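The $ifNull and $cond pairing described above can be imitated with two tiny JavaScript functions. These `ifNull` and `cond` helpers are hypothetical stand-ins written for this example, not the server's implementation, and JavaScript truthiness is only an approximation of how $cond evaluates its condition (for instance, an empty string is falsy in JavaScript).

```javascript
// Returns the value unless it is null or missing, in which case the replacement.
const ifNull = (value, replacement) =>
  value === null || value === undefined ? replacement : value;

// Picks one of two results depending on the test.
const cond = (test, thenValue, elseValue) => (test ? thenValue : elseValue);

const withStreet = { street: "Green", number: 66, city: "Star City" };
const withoutStreet = { number: 66, city: "Star City" };

// Each address becomes 1 when "street" exists and 0 when it does not...
const flags = [withStreet, withoutStreet].map((a) =>
  cond(ifNull(a.street, false), 1, 0)
);

// ...and the 1/0 values are then added up, as $sum does in the pipelines.
const sum = flags.reduce((x, y) => x + y, 0);

console.log(flags, sum); // flags is [1, 0] and sum is 1
```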

Answer 2

You can try this:

select * from tbl_1st
left join tbl_2nd on tbl_1st.tt = tbl_2nd.Tt
left join tbl_3rd on tbl_1st.TT = tbl_3rd.tt
where tbl_1st.tt = 100000000000000001

In the where clause, we exclude documents whose address is an array and include those whose address is of type object.
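The filter described in that sentence can be sketched as a JavaScript predicate. This is an assumption about what such a clause might look like, written the way a $where function sees a document (as `this`); the function name `addressIsObjectWithStreet` and the sample data are made up for illustration.

```javascript
// Predicate in the style of a $where function: `this` is the current document.
function addressIsObjectWithStreet() {
  const a = this.address;
  return (
    a !== null &&
    typeof a === "object" &&   // excludes strings such as "homeless"
    !Array.isArray(a) &&       // excludes arrays, which are also typeof "object"
    "street" in a
  );
}

// Reduced versions of the three sample documents.
const samples = [
  { name: "Barry", address: [{ street: "Red" }, { street: "Yellow" }] },
  { name: "Oliver", address: { street: "Green" } },
  { name: "Tudof", address: "homeless" }
];

const results = samples.map((doc) => addressIsObjectWithStreet.call(doc));
const matchCount = results.filter(Boolean).length;

console.log(results, matchCount); // only the second document matches, so matchCount is 1
```

Because $where evaluates JavaScript for each document, a filter like this would likely share the speed concerns raised about mapReduce, so the $exists form from the first answer is probably the better fit when it applies.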
