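MongoDB query to pick random documents from a collection until a total is reached

Time: 2018-02-04 16:17:27

Tags: mongodb

I have a collection containing documents like this:

songs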

{
  "title": "Highway To Hell",
  "duration": 4,
  "tags": [{"label":"rock"},{"label":"heavy"}]
}
{
  "title": "Never Be The Same",
  "duration": 3,
  "tags": [{"label":"pop"}]
}
{
  "title": "Wake Up",
  "duration": 4,
  "tags": [{"label":"metal"},{"label":"heavy"}]
}
{
  "title": "Catharsis",
  "duration": 3,
  "tags": [{"label":"metal"},{"label":"heavy"}]
}

I want to query random songs tagged "heavy" whose combined duration is at most 8 minutes.

Query result 1

{
  "title": "Highway To Hell",
  "duration": 4,
  "tags": [{"label":"rock"},{"label":"heavy"}]
}
{
  "title": "Wake Up",
  "duration": 4,
  "tags": [{"label":"metal"},{"label":"heavy"}]
}

Query result 2

{
  "title": "Highway To Hell",
  "duration": 4,
  "tags": [{"label":"rock"},{"label":"heavy"}]
}
{
  "title": "Catharsis",
  "duration": 3,
  "tags": [{"label":"metal"},{"label":"heavy"}]
}

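I have looked at the $sample operator, but it requires the number of random documents up front, and in this case that number is not fixed. I would also need a variable to keep track of the accumulated duration, but I'm not sure how to do that.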

1 Answer:

Answer 0: (score: 1)

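You can try this aggregation:

  1. $match - keep only songs whose tags include "heavy"
  2. $sample - take n random samples (the order of the elements is not guaranteed)
  3. $group - group everything under a single _id (null) so the songs land in one array that the next stage can reduce
  4. $project - use $reduce to walk the array while keeping a running total of the duration; if adding a song keeps the total within the maximum duration, append the song to the result array, otherwise skip it
  5. $unwind - unwind the filtered songs
  6. $replaceRoot - return the songs in their original document structure
  7. Note: this assumes each song lasts at least 1 minute, so at most 8 songs can fit into 8 minutes; if songs can be shorter, increase the sample size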
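
To make step 4 easier to follow, here is a minimal plain-JavaScript sketch of the same accumulation logic, run outside MongoDB. The function name `pickWithinBudget` and the `sampled` array are hypothetical and only illustrate the idea; they are not part of the original answer.

```javascript
// Plain-JavaScript mirror of the $reduce stage (illustrative only):
// walk the sampled songs in order and keep a song only if adding it
// leaves the running total at or below the budget.
function pickWithinBudget(songs, maxTotal) {
  return songs.reduce(
    (acc, song) => {
      const fits = acc.total + song.duration <= maxTotal;
      return {
        // Add the duration only when the song is kept.
        total: acc.total + (fits ? song.duration : 0),
        // Append the song only when it fits; otherwise leave the list as is.
        filtered: fits ? acc.filtered.concat([song]) : acc.filtered,
      };
    },
    { filtered: [], total: 0 }
  );
}

// Heavy songs from the question, in one possible $sample order (assumed).
const sampled = [
  { title: "Catharsis", duration: 3 },
  { title: "Highway To Hell", duration: 4 },
  { title: "Wake Up", duration: 4 },
];

const result = pickWithinBudget(sampled, 8);
console.log(result.filtered.map((s) => s.title), result.total);
```

For this particular order, "Catharsis" and "Highway To Hell" are kept (7 minutes in total) and "Wake Up" is skipped because it would push the total to 11. A different sample order can yield a different selection, which is where the randomness comes from.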

    Pipeline

        db.songs.aggregate(
          [
            // Keep only songs that carry the "heavy" tag.
            {$match : {"tags.label" : "heavy"}},
            // Draw up to 8 random songs (order is not guaranteed).
            {$sample : {size : 8}},
            // Collect the sampled songs into a single array.
            {$group : {_id : null, songs : {$push : "$$ROOT"}}},
            // Walk the array; keep a song only if the running total stays <= 8.
            {$project : {
              songs : {
                $reduce : {
                  input : "$songs",
                  initialValue : { filtered : [], total : 0 },
                  in : {
                    total : {$add : ["$$value.total",{$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},"$$this.duration", 0]}]},
                    filtered : {$concatArrays : [ "$$value.filtered", {$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},["$$this"], []]}]}
                  }
                }
              }
            }},
            // Emit one document per kept song, in its original shape.
            {$unwind : "$songs.filtered"},
            {$replaceRoot : {newRoot : "$songs.filtered"}}
          ]
        ).pretty()
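
The tag, the 8-minute limit, and the sample size are hard-coded in the pipeline. As a rough sketch, the stages could be generated by a small helper so these values become parameters. The name `buildPlaylistPipeline` and its arguments are hypothetical and do not come from the original answer; the stages themselves are the same ones used in the pipeline.

```javascript
// Hypothetical helper (not from the original answer): builds the same
// aggregation stages for an arbitrary tag, duration budget, and sample size.
function buildPlaylistPipeline(tag, maxDuration, sampleSize) {
  // True when adding the current song keeps the running total within budget.
  const fits = {
    $lte: [{ $add: ["$$value.total", "$$this.duration"] }, maxDuration],
  };
  return [
    { $match: { "tags.label": tag } },
    { $sample: { size: sampleSize } },
    { $group: { _id: null, songs: { $push: "$$ROOT" } } },
    {
      $project: {
        songs: {
          $reduce: {
            input: "$songs",
            initialValue: { filtered: [], total: 0 },
            in: {
              total: {
                $add: ["$$value.total", { $cond: [fits, "$$this.duration", 0] }],
              },
              filtered: {
                $concatArrays: ["$$value.filtered", { $cond: [fits, ["$$this"], []] }],
              },
            },
          },
        },
      },
    },
    { $unwind: "$songs.filtered" },
    { $replaceRoot: { newRoot: "$songs.filtered" } },
  ];
}

const pipeline = buildPlaylistPipeline("heavy", 8, 8);
console.log(JSON.stringify(pipeline, null, 2));
```

In the mongo shell the result would be passed straight to `db.songs.aggregate(buildPlaylistPipeline("heavy", 8, 8))`. Choosing `sampleSize` remains a judgment call: it has to be large enough that the budget can be filled even when the shortest songs are drawn.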
    

    Possible output for the test data in the question

    > db.songs.aggregate(   [     {$match : {"tags.label" : "heavy"}},     {$sample : {size : 8}},     {$group : {_id : null, songs : {$push : "$$ROOT"}}},     {$project : {       songs : {         $reduce : {             input : "$songs",             initialValue : { filtered : [], total : 0 },             in : {                             total : {$add : ["$$value.total",{$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},"$$this.duration", 0]}]},               filtered : {$concatArrays : [ "$$value.filtered", {$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},["$$this"], []]}]}             }           }         }       }     },     {$unwind : "$songs.filtered"},     {$replaceRoot : {newRoot : "$songs.filtered"}}   ] ).pretty()
    {
        "_id" : ObjectId("5a7739693eafa689da47ba81"),
        "title" : "Wake Up",
        "duration" : 4,
        "tags" : [
            {
                "label" : "metal"
            },
            {
                "label" : "heavy"
            }
        ]
    }
    {
        "_id" : ObjectId("5a7739693eafa689da47ba7f"),
        "title" : "Highway To Hell",
        "duration" : 4,
        "tags" : [
            {
                "label" : "rock"
            },
            {
                "label" : "heavy"
            }
        ]
    }
    > 
    > db.songs.aggregate(   [     {$match : {"tags.label" : "heavy"}},     {$sample : {size : 8}},     {$group : {_id : null, songs : {$push : "$$ROOT"}}},     {$project : {       songs : {         $reduce : {             input : "$songs",             initialValue : { filtered : [], total : 0 },             in : {                             total : {$add : ["$$value.total",{$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},"$$this.duration", 0]}]},               filtered : {$concatArrays : [ "$$value.filtered", {$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},["$$this"], []]}]}             }           }         }       }     },     {$unwind : "$songs.filtered"},     {$replaceRoot : {newRoot : "$songs.filtered"}}   ] ).pretty()
    {
        "_id" : ObjectId("5a7739693eafa689da47ba7f"),
        "title" : "Highway To Hell",
        "duration" : 4,
        "tags" : [
            {
                "label" : "rock"
            },
            {
                "label" : "heavy"
            }
        ]
    }
    {
        "_id" : ObjectId("5a7739693eafa689da47ba82"),
        "title" : "Catharsis",
        "duration" : 3,
        "tags" : [
            {
                "label" : "metal"
            },
            {
                "label" : "heavy"
            }
        ]
    }
    > db.songs.aggregate(   [     {$match : {"tags.label" : "heavy"}},     {$sample : {size : 8}},     {$group : {_id : null, songs : {$push : "$$ROOT"}}},     {$project : {       songs : {         $reduce : {             input : "$songs",             initialValue : { filtered : [], total : 0 },             in : {                             total : {$add : ["$$value.total",{$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},"$$this.duration", 0]}]},               filtered : {$concatArrays : [ "$$value.filtered", {$cond :[{$lte : [{$add : ["$$value.total", "$$this.duration"]}, 8]},["$$this"], []]}]}             }           }         }       }     },     {$unwind : "$songs.filtered"},     {$replaceRoot : {newRoot : "$songs.filtered"}}   ] ).pretty()
    {
        "_id" : ObjectId("5a7739693eafa689da47ba82"),
        "title" : "Catharsis",
        "duration" : 3,
        "tags" : [
            {
                "label" : "metal"
            },
            {
                "label" : "heavy"
            }
        ]
    }
    {
        "_id" : ObjectId("5a7739693eafa689da47ba81"),
        "title" : "Wake Up",
        "duration" : 4,
        "tags" : [
            {
                "label" : "metal"
            },
            {
                "label" : "heavy"
            }
        ]
    }
    >