Question

I'm a MongoDB beginner working on a homework problem. Each document in the dataset is a single grade with a student_id, a type, and a score.

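As part of the problem, I have to delete the lowest-scoring "homework" document for each student. Here is my strategy, using an aggregation pipeline:

1: First, filter all documents with type: homework
2: Sort by student_id, score
3: Group on student_id and take the first element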

This gives me the lowest-scoring document for each student, but how do I delete those documents from the original dataset? Any guidance or hints?

Answer 1

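Use the cursor returned by the aggregation: loop over the documents with the cursor's forEach() method, then delete each document from the collection, using its _id as the query in the remove() method. Something like this: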

var cursor = db.grades.aggregate(pipeline);
cursor.forEach(function (doc){
    db.grades.remove({"_id": doc._id});
});

Another approach is to use the map() method to build an array of document _ids and delete those documents in a single call:

var cursor = db.grades.aggregate(pipeline),
    ids = cursor.map(function (doc) { return doc._id; });
db.grades.remove({"_id": { "$in": ids }});
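One caveat when applying either snippet to the strategy from the question: after a $group stage, doc._id is the group key (the student_id), not the _id of the grade document, so the pipeline has to carry the original _id along. A minimal sketch, assuming the collection is db.grades; the field name doc_id and the helper idsToDelete are hypothetical names used only for illustration:

```javascript
// Pipeline following the question's strategy: filter, sort, take the first per student.
// `doc_id` is a hypothetical field name used to carry the original _id through $group.
const pipeline = [
  { $match: { type: "homework" } },
  { $sort: { student_id: 1, score: 1 } },
  { $group: {
      _id: "$student_id",
      doc_id: { $first: "$_id" },   // _id of the lowest-scoring homework document
      score: { $first: "$score" }
  } }
];

// Extract the _ids to delete from the grouped results (hypothetical helper).
function idsToDelete(groupResults) {
  return groupResults.map(function (g) { return g.doc_id; });
}

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  const ids = idsToDelete(db.grades.aggregate(pipeline).toArray());
  db.grades.deleteMany({ _id: { $in: ids } });
}
```

Running the pipeline on its own first and checking that it returns one document per student is a cheap safeguard before deleting anything.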

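- Update -

For large delete operations, it may be more efficient to copy the documents you want to keep into a new collection and then call drop() on the original collection. To copy the documents to keep, your aggregation pipeline needs to return the documents without the lowest homework document, and use the $out operator as the final pipeline stage to write them to another collection. Consider the following aggregation pipeline: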

db.grades.aggregate([    
    {
        '$group':{
            '_id': {
                "student_id": "$student_id",
                "type": "$type"
            },
            'lowest_score': { "$min": '$score'},
            'data': {
                '$push': '$$ROOT'
            }
         }
    },    
    {
        "$unwind": "$data"
    },
    {
        "$project": {
            "_id": "$data._id",
            "student_id" : "$data.student_id",
            "type" : "$data.type",
            "score" : "$data.score",
            'lowest_score': 1,            
            "isHomeworkLowest": {
                "$cond": [
                    { 
                        "$and": [
                            { "$eq": [ "$_id.type", "homework" ] },
                            { "$eq": [ "$data.score", "$lowest_score" ] }
                        ] 
                    },
                    true,
                    false
                ]
            }
        }
    },
    {
        "$match": {"isHomeworkLowest" : false}
    },
    {
        "$project": {           
            "student_id": 1,
            "type": 1,
            "score": 1
        }
    },
    {
        "$out": "new_grades"
    }
])

You can then drop the old collection with db.grades.drop() and query the new one with db.new_grades.find().
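To check what the pipeline keeps before dropping anything, the same rule can be mirrored in plain JavaScript on an in-memory array. This is only a sketch for reasoning about the logic, not part of the original answer; the function name withoutLowestHomework and the sample documents are made up for illustration:

```javascript
// Mirrors the pipeline's rule: drop a document only if it is a homework
// whose score equals the minimum score of its (student_id, type) group.
function withoutLowestHomework(docs) {
  const lowest = new Map();            // "student_id|type" -> minimum score
  for (const d of docs) {
    const key = d.student_id + "|" + d.type;
    if (!lowest.has(key) || d.score < lowest.get(key)) lowest.set(key, d.score);
  }
  return docs.filter(function (d) {
    const isHomeworkLowest =
      d.type === "homework" && d.score === lowest.get(d.student_id + "|" + d.type);
    return !isHomeworkLowest;
  });
}

// Illustrative sample data (not taken from the real dataset).
const sample = [
  { student_id: 0, type: "exam", score: 54.6 },
  { student_id: 0, type: "homework", score: 14.8 },
  { student_id: 0, type: "homework", score: 63.9 },
  { student_id: 1, type: "homework", score: 21.3 },
  { student_id: 1, type: "homework", score: 44.3 }
];
console.log(withoutLowestHomework(sample).map(function (d) { return d.score; }));
// prints [ 54.6, 63.9, 44.3 ]
```

As in the pipeline, two homework documents with identical scores would both be dropped, which is worth checking for on real data.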

Answer 2

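I believe this is the database part of a homework from the MongoDB for Java Developers course offered by MongoDB University. The requirement is to delete the lowest score for each student. Anyway, this is how I solved it, and I hope it helps. You can also clone my code from my GitHub link (provided below).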

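import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

import static com.mongodb.client.model.Filters.eq;

public class Homework2Week2 {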

public static void main(String[] args) {
    // The API used here is that of mongo-java-driver-3.2.2.jar.
    /* If you want to use a different version of mongo-java-driver,
       check the documentation for that version. */
    MongoClient mongoClient = new MongoClient();
    // get handle to "students" database
    MongoDatabase database = mongoClient.getDatabase("students");
    // get a handle to the "grades" collection
    MongoCollection<Document> collection = database.getCollection("grades");
    /*
     * Write a program in the language of your choice that will remove the grade of type "homework" with the lowest score for each student from the dataset in the handout. 
     * Since each document is one grade, it should remove one document per student. 
     * This will use the same data set as the last problem, but if you don't have it, you can download and re-import.
     * The dataset contains 4 scores each for 200 students.
     * First, let's confirm your data is intact; the number of documents should be 800.

     *Hint/spoiler: If you select homework grade-documents, sort by student
      and then by score, you can iterate through and find the lowest score
      for each student by noticing a change in student id. As you notice
      that change of student_id, remove the document.
     */
    MongoCursor<Document> cursor = collection.find(eq("type", "homework")).sort(new Document("student_id", 1).append("score", 1)).iterator();
    int curStudentId = -1;
    try
    {
    while (cursor.hasNext()) {
        Document doc = cursor.next();
        int studentId=(int) doc.get("student_id");
        if (studentId != curStudentId) {
            // Delete exactly this document, matched by its unique _id
            collection.deleteOne(eq("_id", doc.get("_id")));
            curStudentId = studentId;
        }
    }
    }finally {
        //Close cursor
        cursor.close();
    }   
    //Close mongoClient
    mongoClient.close();
}

}

The complete project code is in my GitHub account. If anyone wants it, you can try this link.
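For readers not using Java, the hint quoted in the code comment (sort by student and score, then act on each change of student_id) can also be sketched for the mongo shell. The helper name firstPerStudent is hypothetical, and the shell part assumes the same grades collection:

```javascript
// Given homework docs sorted by student_id, then score (both ascending),
// return the first document seen for each student, i.e. the lowest score.
function firstPerStudent(sortedDocs) {
  const picked = [];
  let curKey = null;
  for (const doc of sortedDocs) {
    // Compare as strings so wrapped numeric types also compare by value.
    const key = String(doc.student_id);
    if (key !== curKey) {
      picked.push(doc);
      curKey = key;
    }
  }
  return picked;
}

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  const sorted = db.grades.find({ type: "homework" })
                          .sort({ student_id: 1, score: 1 })
                          .toArray();
  firstPerStudent(sorted).forEach(function (doc) {
    db.grades.deleteOne({ _id: doc._id });
  });
}
```

Separating the selection from the deletion makes it possible to inspect the picked documents before removing them.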

Answer 3

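db.grades.aggregate([
    { $match: { type: 'homework' } },
    { $group: {
        _id: { student_id: "$student_id", type: '$type' },
        // $min rather than $max: the task is to remove the lowest homework score
        score: { $min: "$score" }
    } }
]).forEach(function (doc) {
    // Match on type as well, so a non-homework grade with the same score is never removed
    db.grades.remove({ 'student_id': doc._id.student_id, 'type': doc._id.type, 'score': doc.score });
});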

Answer 4

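Starting in Mongo 4.4, the $group stage has a new aggregation operator, $accumulator, which allows custom accumulation of documents as they are grouped.

In this case, we use the $out stage to replace the original collection with the result of the aggregation pipeline (from which each student's lowest score has been stripped):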

// > db.collection.find()
//     { "student_id" : 0, "type" : "exam",     "score" : 54.6535436362647  }
//     { "student_id" : 0, "type" : "homework", "score" : 14.8504576811645  }
//     { "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
//     { "student_id" : 1, "type" : "homework", "score" : 21.33260810416115 }
//     { "student_id" : 1, "type" : "homework", "score" : 44.31667452616328 }
db.collection.aggregate(
  { $group: {
      _id: "$student_id",
      docs: { $accumulator: {
        accumulateArgs: ["$$ROOT"],
        init: function() { return []; },
        accumulate: function(docs, doc) { return docs.concat(doc); },
        merge: function(docs1, docs2) { return docs1.concat(docs2); },
        finalize: function(docs) {
          var min = Math.min(...docs.map(x => x.score));
          var i = docs.findIndex((doc) => doc.score == min);
          docs.splice(i, 1);
          return docs;
        },
        lang: "js"
      }}
  }},
  { $unwind: "$docs" },
  { $replaceWith: "$docs" },
  { $out: "collection" }
)
// > db.collection.find()
//     { "student_id" : 0, "type" : "exam",     "score" : 54.6535436362647  }
//     { "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
//     { "student_id" : 1, "type" : "homework", "score" : 44.31667452616328 }

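This:

$group groups the documents by student_id and accumulates them into an array from which the lowest-scoring document has been stripped:
- accumulateArgs is the list of fields (or, in our case, the whole document, $$ROOT) passed to the accumulate function.
- Each group's accumulated array is initialized by init as an empty array.
- Documents are simply concatenated (accumulate and merge).
- Finally, once all documents have been grouped, the finalize step finds the lowest-scoring document in the group and removes it.
- At the end of this stage, the pipeline documents look like this: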
```
{
  "_id" : 0,
  "docs" : [
    { "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 },
    { "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 },
    { "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
  ]
}
...
```

$unwind flattens the accumulated array field of the grouped documents, which gets us back to something like:

{ "_id" : 0, "docs" : { "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 } }
{ "_id" : 0, "docs" : { "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 } }
{ "_id" : 0, "docs" : { "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 } }
...

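$replaceWith replaces all existing fields in each document with the contents of the accumulated field, restoring the original format. At the end of this stage we have: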

{ "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
...

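$out writes the result of the aggregation pipeline to the same collection. Note that $out conveniently replaces the contents of the specified collection, which is what makes this solution possible.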
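Because the four $accumulator functions are ordinary JavaScript, their behaviour can be checked outside MongoDB before running a pipeline that overwrites the collection. The sketch below copies them from the pipeline above and drives them by hand; the driver function runAccumulator, and the way it splits the input into two partial states to exercise merge, are assumptions made only for illustration and do not claim to reproduce how the server schedules these calls:

```javascript
// The four functions, copied from the $accumulator definition above.
const acc = {
  init: function () { return []; },
  accumulate: function (docs, doc) { return docs.concat(doc); },
  merge: function (docs1, docs2) { return docs1.concat(docs2); },
  finalize: function (docs) {
    var min = Math.min(...docs.map(x => x.score));
    var i = docs.findIndex((doc) => doc.score == min);
    docs.splice(i, 1);
    return docs;
  }
};

// Hypothetical driver: accumulate two halves separately, merge them, then finalize.
function runAccumulator(groupDocs) {
  const half = Math.ceil(groupDocs.length / 2);
  const left = groupDocs.slice(0, half).reduce(acc.accumulate, acc.init());
  const right = groupDocs.slice(half).reduce(acc.accumulate, acc.init());
  return acc.finalize(acc.merge(left, right));
}

// The documents of student 0 from the example input above.
const student0 = [
  { student_id: 0, type: "exam", score: 54.6535436362647 },
  { student_id: 0, type: "homework", score: 14.8504576811645 },
  { student_id: 0, type: "homework", score: 63.98402553675503 }
];
console.log(runAccumulator(student0).map(d => d.score));
// prints [ 54.6535436362647, 63.98402553675503 ]
```

One thing this makes visible: finalize, as written, removes the lowest score of any type. If an exam or quiz could score below both homeworks, the minimum would presumably need to be taken over the homework documents only.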

Answer 5

int studentId = (int) doc.get("student_id");

gives a type-cast error. Could you check it again?

As far as I know, we can do it as follows.

int studentId = Integer.valueOf(doc.get("student_id").toString());

Answer 6

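This problem is part of MongoDB University's M101P: MongoDB for Developers course. The requirement here is:

Remove the grade of type "homework" with the lowest score for each student from the dataset. Since each document is one grade, it should remove one document per student.

So this means each student_id has 4 "type" entries, two of which are "homework". We have to delete the lower-scoring of the two "type": "homework" documents.

The working code in pymongo is as follows: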

URL

Terminal output: $ python remove_grade.py

200

How do I delete the documents returned by a group in MongoDB?
