Question

Is there a way to estimate the cost of a query, or the amount of data it would read, without actually running it?

Something similar to the --dry_run flag in Google BigQuery.

Answer 1

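I don't think there is such a feature at the moment. However, you can run explain() on the query, e.g. db.airbnb.explain().find(....). The query plan should show each data node's url, which includes the size, for example: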

> db.airbnb.explain().find({ "address.market" : "New York", "price": {$lt: NumberDecimal("200.00")} } )
{
  "ok" : 1,
  "plan" : {
    "kind" : "multiPlanNode",
    "regionPlans" : {
      "2/ap-southeast-2" : {
....
        "node" : {
          "kind" : "data",
          "partitions" : [
            {
              "url" : "s3://xxxx/json/airbnb/listingsAndReviews.json?agentRegion=2%2Fap-southeast-2&format=.json&region=ap-southeast-2&size=92.65681457519531+MiB",
              "attributes" : {

              }
            }
....

Note this part:

"url" : "s3://xxxx/json/airbnb/listingsAndReviews.json?agentRegion=2%2Fap-southeast-2&format=.json&region=ap-southeast-2&size=92.65681457519531+MiB"

This means the query will read that S3 URL, whose object is about 92.66 MiB in size.
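If a plan lists many partitions, adding the sizes up by hand gets tedious. The following is a minimal sketch of how the total could be computed from the explain output. It assumes every partition url carries a size parameter in the "number+unit" form shown above, and that the units are binary ones (B, KiB, MiB, GiB, TiB). The helper names partitionSizeBytes and estimateBytesRead are made up for this example, and samplePlan is an abbreviated copy of the output above with the elided parts left out.

```javascript
// Multipliers for the unit suffixes assumed to appear in the size parameter.
const UNIT_BYTES = { B: 1, KiB: 1024, MiB: 1024 ** 2, GiB: 1024 ** 3, TiB: 1024 ** 4 };

// Parse the "size" query parameter of a partition URL into bytes.
// Returns null when the parameter is missing or not in the assumed format.
function partitionSizeBytes(url) {
  const query = url.split("?")[1] || "";
  for (const pair of query.split("&")) {
    const [key, value = ""] = pair.split("=");
    if (key !== "size") continue;
    // "+" encodes a space in a query string, e.g. "92.65+MiB".
    const decoded = decodeURIComponent(value.replace(/\+/g, " "));
    const [amount, unit = "B"] = decoded.split(" ");
    const factor = UNIT_BYTES[unit];
    const n = Number(amount);
    return factor !== undefined && Number.isFinite(n) ? n * factor : null;
  }
  return null;
}

// Recursively walk the explain output and sum the size of every "url" field.
function estimateBytesRead(node) {
  if (node === null || typeof node !== "object") return 0;
  let total = 0;
  for (const [key, value] of Object.entries(node)) {
    if (key === "url" && typeof value === "string") {
      total += partitionSizeBytes(value) || 0;
    } else {
      total += estimateBytesRead(value);
    }
  }
  return total;
}

// Abbreviated copy of the explain output above (elided parts omitted).
const samplePlan = {
  ok: 1,
  plan: {
    kind: "multiPlanNode",
    regionPlans: {
      "2/ap-southeast-2": {
        node: {
          kind: "data",
          partitions: [
            {
              url: "s3://xxxx/json/airbnb/listingsAndReviews.json?agentRegion=2%2Fap-southeast-2&format=.json&region=ap-southeast-2&size=92.65681457519531+MiB",
              attributes: {}
            }
          ]
        }
      }
    }
  }
};

const mib = estimateBytesRead(samplePlan) / 1024 ** 2;
console.log(mib.toFixed(2) + " MiB"); // prints "92.66 MiB"
```

The walker does not depend on the exact nesting of the plan, so it should keep working if the nodes are arranged differently, as long as the url fields keep the same format. Treat the result as an upper bound on the data scanned for the listed partitions, not as a billing figure.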

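Edit: As @willis pointed out, running explain() without any arguments does not actually run the query; it only shows the execution plan (see explain() behavior). With explain('executionStats'), however, the query is actually executed.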

Estimating the impact of a query on MongoDB Atlas Data Lake
