
Question

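I need to sum the values from 2018-06-01 to 2018-06-30 for each document in the collection. Each key under "days" is a different date with its own value. What should the mongo aggregate command look like? The result should look like { _id: Product_123, June_Sum: value }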

Answer 1

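That's really not a great structure for the sort of operation you now want to do. The whole point of keeping data in this format is that you "increment" it as you go.

For example: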

 var now = Date.now(),
     today = new Date(now - ( now % ( 1000 * 60 * 60 * 24 ))).toISOString().substr(0,10);

 var product = "Product_123";

 db.counters.updateOne(
   { 
     "month": today.substr(0,7),
     "product": product
   },
   { 
     "$inc": { 
       [`dates.${today}`]: 1,
       "totals": 1
     }
   },
   { "upsert": true }
 )

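In this way, each subsequent update with $inc applies both to the "key" used for the "date" and to the "totals" property of the matched document. So after a few iterations you would end up with something like: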

{
        "_id" : ObjectId("5af395c53945a933add62173"),
        "product": "Product_123",
        "month": "2018-05",
        "dates" : {
                "2018-05-10" : 2,
                "2018-05-09" : 1
        },
        "totals" : 3
}
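
As a rough illustration of what the upsert with $inc achieves, here is a minimal in-memory sketch in plain JavaScript. The `counters` Map and the `recordHit` helper are hypothetical stand-ins, not MongoDB APIs; they only mimic the effect on the stored document.

```javascript
// In-memory stand-in for the "counters" collection (hypothetical, for illustration only)
const counters = new Map();

// Mimics the upsert + $inc shown earlier: create the monthly document if it is missing,
// then bump both the per-day key and the running total.
function recordHit(product, day) {
  const month = day.slice(0, 7);
  const id = `${product}|${month}`;
  if (!counters.has(id)) {
    counters.set(id, { product, month, dates: {}, totals: 0 });
  }
  const doc = counters.get(id);
  doc.dates[day] = (doc.dates[day] || 0) + 1;
  doc.totals += 1;
  return doc;
}

recordHit("Product_123", "2018-05-09");
recordHit("Product_123", "2018-05-10");
const doc = recordHit("Product_123", "2018-05-10");
console.log(doc);
```

The point of the pattern is that the monthly total is always available on the document without any later processing.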

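If you're not actually doing this, then you "should" be, since it's the intended usage pattern for such a structure.

Without keeping a "totals" or similar entry in the documents storing these keys, the only methods left for "aggregation" in processing are to effectively coerce the "keys" into an "array" form.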
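
To make the "coerce keys into an array" idea concrete, here is a small plain-JavaScript sketch of the same transformation. The `objectToArray` and `sumRange` helpers and the sample document are assumptions for illustration; the `days` field name follows the question. The string comparison only works because zero-padded ISO dates sort lexicographically in date order.

```javascript
// Plain-JavaScript equivalent of $objectToArray: { key: value } -> [{ k, v }]
function objectToArray(obj) {
  return Object.entries(obj).map(([k, v]) => ({ k, v }));
}

// Sum the values whose date key falls in the half-open range [start, end).
// Zero-padded "YYYY-MM-DD" strings compare correctly as plain strings.
function sumRange(days, start, end) {
  return objectToArray(days)
    .filter(({ k }) => k >= start && k < end)
    .reduce((total, { v }) => total + v, 0);
}

// Hypothetical document shape, using the "days" field named in the question
const sample = {
  product: "Product_123",
  days: { "2018-05-31": 4, "2018-06-01": 2, "2018-06-30": 5, "2018-07-01": 7 }
};

const juneSum = sumRange(sample.days, "2018-06-01", "2018-07-01");
console.log({ _id: sample.product, June_Sum: juneSum });
```

Every key of every candidate document has to be visited this way, which is why the structure is awkward for range queries.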

MongoDB 3.6 with $objectToArray

db.collection.aggregate([
  // Only consider documents with entries within the range
  { "$match": {
    "$expr": {
      "$anyElementTrue": {
        "$map": {
          "input": { "$objectToArray": "$days" },
          "in": {
            "$and": [
              { "$gte": [ "$$this.k", "2018-06-01" ] },
              { "$lt": [ "$$this.k", "2018-07-01" ] }
            ]
          }
        }
      }
    }
  }},
  // Aggregate for the month 
  { "$group": {
    "_id": "$product",           // <-- or whatever your key for the value is
    "total": {
      "$sum": {
        "$sum": {
          "$map": {
            "input": { "$objectToArray": "$days" },
            "in": {
              "$cond": {
                "if": {
                  "$and": [
                    { "$gte": [ "$$this.k", "2018-06-01" ] },
                    { "$lt": [ "$$this.k", "2018-07-01" ] }
                  ]
                },
                "then": "$$this.v",
                "else": 0
              }
            }
          }
        }
      }
    }
  }}
])

Other versions with mapReduce

db.collection.mapReduce(
  // Taking the same presumption on your un-named key for "product"
  function() {
    Object.keys(this.days)
      .filter( k => k >= "2018-06-01" && k < "2018-07-01")
      .forEach(k => emit(this.product, this.days[k]));
  },
  function(key,values) {
    return Array.sum(values);
  },
  {
    "out": { "inline": 1 },
    "query": {
      "$where": function() {
        return Object.keys(this.days).some(k => k >= "2018-06-01" && k < "2018-07-01")
      }
    }
  }
)

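Both of these are pretty horrible, since you need to test whether the "keys" fall within the required range just to select the documents, and then filter through the keys in those documents again to decide whether to accumulate each one.

Also note that if your "Product_123" is also the "name of a key" in the document and NOT a "value", then you're performing even more "gymnastics" simply to convert that "key" into a "value" form, which is how databases do things and is the whole point of the unnecessary coercion going on here.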
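
As a sketch of that extra step, assume purely for illustration a document where the product name is itself a key. The document shape and the `productsAsValues` helper below are hypothetical; the question does not show the exact layout.

```javascript
// Hypothetical document where the product name is a key rather than a value
const keyed = {
  _id: 1,
  Product_123: { days: { "2018-06-01": 2, "2018-06-15": 3 } }
};

// Turn every non-_id key into a { product, days } pair so the name becomes a value
function productsAsValues(doc) {
  return Object.entries(doc)
    .filter(([key]) => key !== "_id")
    .map(([product, body]) => ({ product, days: body.days }));
}

const normalized = productsAsValues(keyed);
console.log(normalized);
```

Only after this conversion can the product name be grouped on like any other value.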

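Better Option

Apart from the handling shown at the start, where you "should" be accumulating "as you go" with every write to the document at hand, the better option than needing "processing" to coerce the data into an array format is simply to put the data into an array in the first place: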

{
        "_id" : ObjectId("5af395c53945a933add62173"),
        "product": "Product_123",
        "month": "2018-05",
        "dates" : [
          { "day": "2018-05-09", "value": 1 },
          { "day": "2018-05-10", "value": 2 }
        ],
        "totals" : 3
}

This is far better for querying and further analysis:

db.counters.aggregate([
  { "$match": {
    // "month": "2018-05"    // <-- or really just that, since it's there
    "dates": {
      "$elemMatch": {
        "day": { "$gte": "2018-05-01", "$lt": "2018-06-01" }
      }
    }
  }},
  { "$group": {
    "_id": null,
    "total": {
      "$sum": {
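        "$sum": {
          // $filter returns sub-documents, so extract "value" before summing
          "$map": {
            "input": {
              "$filter": {
                "input": "$dates",
                "cond": {
                  "$and": [
                    { "$gte": [ "$$this.day", "2018-05-01" ] },
                    { "$lt": [ "$$this.day", "2018-06-01" ] }
                  ]
                }
              }
            },
            "in": "$$this.value"
          }
        }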
      }
    }
  }}
])
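
For intuition, the per-document computation that pipeline is after can be sketched in plain JavaScript. The `sumDays` helper is a hypothetical name; the sample document is the array-format one shown above.

```javascript
// Array-format document, as in the example above
const counterDoc = {
  product: "Product_123",
  month: "2018-05",
  dates: [
    { day: "2018-05-09", value: 1 },
    { day: "2018-05-10", value: 2 }
  ],
  totals: 3
};

// Keep the entries inside [start, end), then add up their "value" fields
function sumDays(dates, start, end) {
  return dates
    .filter(({ day }) => day >= start && day < end)
    .reduce((total, { value }) => total + value, 0);
}

const mayTotal = sumDays(counterDoc.dates, "2018-05-01", "2018-06-01");
console.log(mayTotal);
```

Because "day" is now an ordinary value, the same range condition can also be used directly in a query filter to select documents.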

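This of course is really efficient, and it purposely avoids the "totals" field, which is already there, for demonstration only. But of course you keep up the "running accumulation" on writes by doing: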

db.counters.updateOne(
   { "product": product, "month": today.substr(0,7), "dates.day": today },
   { "$inc": { "dates.$.value": 1, "totals": 1 } }
)

That's really simple. Adding upserts adds a "little" more complexity:

// A "batch" of operations with bulkWrite
db.counters.bulkWrite([
  // Incrementing the matched element
  { "updateOne": {
    "filter": {
      "product": product,
      "month": today.substr(0,7),
      "dates.day": today
    },
    "update": {
      "$inc": { "dates.$.value": 1, "totals": 1 }
    }
  }},
  // Pushing a new "un-matched" element
  { "updateOne": {
    "filter": {
      "product": product,
      "month": today.substr(0,7),
      "dates.day": { "$ne": today }
    },
    "update": {
      "$push": { "dates": { "day": today, "value": 1 } },
      "$inc": { "totals": 1 }
    }
  }},
  // "Upserting" a new document where nothing matched
  { "updateOne": {
    "filter": {
      "product": product,
      "month": today.substr(0,7)
    },
    "update": {
      "$setOnInsert": {
        "dates": [{ "day": today, "value": 1 }],
        "totals": 1
      }
    },
    "upsert": true
  }}
])
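
The batch sends all three operations every time, but for any given write only one of them should change anything. A minimal in-memory sketch of that decision, assuming a hypothetical `store` array in place of the collection:

```javascript
// In-memory stand-in for the collection (hypothetical, for illustration only)
const store = [];

function recordHit(product, today) {
  const month = today.slice(0, 7);
  const doc = store.find(d => d.product === product && d.month === month);
  const entry = doc && doc.dates.find(e => e.day === today);

  if (entry) {
    // 1. Matched array element: increment it and the running total
    entry.value += 1;
    doc.totals += 1;
  } else if (doc) {
    // 2. Document exists but has no entry for today: push a new element
    doc.dates.push({ day: today, value: 1 });
    doc.totals += 1;
  } else {
    // 3. No document for this product and month: insert one
    store.push({ product, month, dates: [{ day: today, value: 1 }], totals: 1 });
  }
}

recordHit("Product_123", "2018-05-09");
recordHit("Product_123", "2018-05-10");
recordHit("Product_123", "2018-05-10");
console.log(JSON.stringify(store[0]));
```

In the real batch the three filters are mutually exclusive in effect, which is what makes it safe to send them together.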

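But generally you're getting the "best of both worlds": simple accumulation "as you go", plus something that is simple and efficient for later queries and other analysis.

The overall moral of the story is to "choose the right structure" for what you actually want to do. Don't put things into "keys" that are clearly intended to be used as "values". It's an anti-pattern that just adds complexity and inefficiency to the rest of your purposes, even if it seemed right for a "single" purpose when you originally stored it that way.

NOTE: I'm also not advocating storing "strings" for "dates" in any way here. As noted, the better approach is to use "values" where you really mean "values" you intend to use. When storing date data as a "value", it is always far more efficient and practical to store it as a BSON Date, and NOT a "string".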
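
As a hedged sketch of what querying with real dates could look like, the snippet below builds a half-open UTC range for a month and a filter for the array layout shown earlier. The `monthRange` helper is a hypothetical name, and the filter assumes "day" is stored as a Date rather than a string.

```javascript
// Half-open UTC range [start, end) for a given month (month is 1-based)
function monthRange(year, month) {
  return {
    start: new Date(Date.UTC(year, month - 1, 1)),
    end: new Date(Date.UTC(year, month, 1))
  };
}

const { start, end } = monthRange(2018, 6);

// Filter shape for array elements whose "day" is stored as a Date
// (assumes the array layout shown earlier, with Date values instead of strings)
const filter = {
  dates: { $elemMatch: { day: { $gte: start, $lt: end } } }
};

console.log(start.toISOString(), end.toISOString());
```

Using an exclusive upper bound at the start of the next month avoids having to know how many days the month has.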

How to sum values within a nested date range in MongoDB
