Consider the following dataset:
{ "_id" : { "$oid" : "aaa" }, "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "_id" : { "$oid" : "bbb" }, "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "_id" : { "$oid" : "ccc" }, "student_id" : 0, "type" : "homework", "score" : 14.8504576811645 }
{ "_id" : { "$oid" : "ddd" }, "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
{ "_id" : { "$oid" : "eee" }, "student_id" : 1, "type" : "exam", "score" : 74.20010837299897 }
{ "_id" : { "$oid" : "fff" }, "student_id" : 1, "type" : "quiz", "score" : 96.76851542258362 }
{ "_id" : { "$oid" : "ggg" }, "student_id" : 1, "type" : "homework", "score" : 21.33260810416115 }
{ "_id" : { "$oid" : "hhh" }, "student_id" : 1, "type" : "homework", "score" : 44.31667452616328 }
Say that, for each student, I need to find the lowest score and the corresponding document id (_id).
Here is my pipeline:
pipeline = [
{"$sort":{"student_id":1,"score":1 } },
{"$group": {"_id":"$student_id","mscore":{"$first":"$score"},"docid":{"$first":"$_id"} } },
{"$sort":{"_id":1}},
{"$project":{"docid":1,"_id":0}}
]
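This works, but I am not sure whether that is because my query is correct or because of the way the data happens to be stored.
Here is my strategy:
1. Sort by student_id, then by score.
2. Group by student_id and take $first of score, which gives me student_id and the minimum score.
3. I also need the document id (_id) that belongs to that minimum score, so I use $first on that field as well. Is that right?
Also, suppose that after sorting I need the entire first document. Should I apply $first to every field individually, or is there another way to do this?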
Answer 0 (score: 1)
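To get the entire first document after sorting, apply the $first operator to the system variable $$ROOT, which references the root document, i.e. the top-level document currently being processed in the $group pipeline stage. Your pipeline would look like this: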
var pipeline = [
{
"$sort": { "score": 1 }
},
{
"$group": {
"_id": "$student_id",
"data": { "$first": "$$ROOT" }
}
},
{
"$project": {
"_id": "$data._id",
"student_id": "$data.student_id",
"type": "$data.type",
"lowest_score": "$data.score"
}
}
]
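To see why sorting first and then taking $first per group yields the lowest-scoring document regardless of how the data is stored, here is a minimal plain-Python sketch that only emulates the idea (sort, then keep the first document of each group). It does not talk to MongoDB; the helper name first_per_student is hypothetical, and the documents are the sample data from the question with _id simplified to a plain string.

```python
from itertools import groupby
from operator import itemgetter

# Sample documents from the question (_id simplified to a plain string).
docs = [
    {"_id": "aaa", "student_id": 0, "type": "exam", "score": 54.6535436362647},
    {"_id": "bbb", "student_id": 0, "type": "quiz", "score": 31.95004496742112},
    {"_id": "ccc", "student_id": 0, "type": "homework", "score": 14.8504576811645},
    {"_id": "ddd", "student_id": 0, "type": "homework", "score": 63.98402553675503},
    {"_id": "eee", "student_id": 1, "type": "exam", "score": 74.20010837299897},
    {"_id": "fff", "student_id": 1, "type": "quiz", "score": 96.76851542258362},
    {"_id": "ggg", "student_id": 1, "type": "homework", "score": 21.33260810416115},
    {"_id": "hhh", "student_id": 1, "type": "homework", "score": 44.31667452616328},
]


def first_per_student(documents):
    """Hypothetical helper: emulate a sort on (student_id, score) followed by
    grouping on student_id and keeping the whole first document of each group,
    which is the role "$first": "$$ROOT" plays in the pipeline above."""
    # The explicit sort fixes the order, so the input order does not matter.
    ordered = sorted(documents, key=itemgetter("student_id", "score"))
    # groupby yields consecutive runs with the same student_id; the first
    # document of each run is the one with the lowest score.
    return {
        student_id: next(group)
        for student_id, group in groupby(ordered, key=itemgetter("student_id"))
    }


lowest = first_per_student(docs)
for student_id, doc in lowest.items():
    print(student_id, doc["_id"], doc["score"])
```

Because the ordering comes from the explicit sort rather than from the order in which the documents were stored, shuffling or reversing docs gives the same result. The same reasoning applies to the per-field variant in the question: each $first there reads from the same first document of the group, so the score and _id belong together.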