How to remove duplicate entries from an array?

Date: 2012-03-25 17:18:07

Tags: mongodb

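How do I remove duplicate entries from an array?

In the example below, "Algorithms in C++" has been added twice.

The $unset modifier removes a particular field, but how do I remove an entry from within a field?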

> db.users.find()

{ "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), 
 "favorites" : { "books" : [ "Algorithms in C++",    
                            "The Art of Computer Programmning", 
                            "Graph Theory",      
                            "Algorithms in C++" ] }, 
  "name" : "robert" }

5 answers:

Answer 0 (score: 32)

As of MongoDB 2.2, you can use the aggregation framework with $unwind, $group, and $project stages to achieve this:

db.users.aggregate([{$unwind: '$favorites.books'},
                    {$group: {_id: '$_id',
                              books: {$addToSet: '$favorites.books'},
                              name: {$first: '$name'}}},
                    {$project: {'favorites.books': '$books', name: '$name'}}
                   ])

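Note that the $project stage is needed to move the result back under the favorites field, because the fields computed by $group cannot be nested.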
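To make the three stages easier to follow, here is a minimal in-memory sketch of what each one does to the example document. It is plain JavaScript and does not call MongoDB; unwind, groupById and project are hypothetical helper names, and the unwind step is simplified to carry only the fields this pipeline uses.

```javascript
// Plain-JavaScript illustration only; nothing here talks to MongoDB.
// unwind, groupById and project are hypothetical helper names.
const users = [
  {
    _id: "4f6cd3c47156522f4f45b26f",
    favorites: {
      books: [
        "Algorithms in C++",
        "The Art of Computer Programmning",
        "Graph Theory",
        "Algorithms in C++",
      ],
    },
    name: "robert",
  },
];

// $unwind: emit one row per array element (simplified to the fields we need).
function unwind(docs) {
  return docs.flatMap((doc) =>
    doc.favorites.books.map((book) => ({ _id: doc._id, name: doc.name, book }))
  );
}

// $group with $addToSet and $first: collect the distinct books per _id
// and keep the name from the first row of each group.
function groupById(rows) {
  const groups = new Map();
  for (const row of rows) {
    if (!groups.has(row._id)) {
      groups.set(row._id, { _id: row._id, books: new Set(), name: row.name });
    }
    groups.get(row._id).books.add(row.book);
  }
  return [...groups.values()];
}

// $project: move the flat "books" field back under "favorites".
function project(groups) {
  return groups.map((g) => ({
    _id: g._id,
    favorites: { books: [...g.books] },
    name: g.name,
  }));
}

const result = project(groupById(unwind(users)));
console.log(JSON.stringify(result, null, 2));
```

A JavaScript Set keeps insertion order, whereas MongoDB does not specify the order of an array built by $addToSet, so the real pipeline may return the books in a different order.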

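Answer 1 (score: 4)

What you need to do is use map-reduce to detect and count the duplicate entries, then use $set to replace the entire books array of the document identified by { "_id" : ObjectId("4f6cd3c47156522f4f45b26f") }.

This has been discussed here many times; see: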

Removing duplicate records using MapReduce

Fast way to find duplicates on indexed column in mongodb

http://csanz.posterous.com/look-for-duplicates-using-mongodb-mapreduce

http://www.mongodb.org/display/DOCS/MapReduce

How to remove duplicate record in MongoDB by MapReduce?
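The map-reduce idea in this answer can be sketched for a single document as follows. This is a plain JavaScript approximation and not MongoDB's mapReduce command: mapBooks, reduceCounts and buildSetUpdate are hypothetical names, and a string stands in for the ObjectId.

```javascript
// Plain JavaScript sketch; mapBooks, reduceCounts and buildSetUpdate are
// hypothetical names, and nothing here talks to a MongoDB server.
const doc = {
  _id: "4f6cd3c47156522f4f45b26f",
  favorites: {
    books: [
      "Algorithms in C++",
      "The Art of Computer Programmning",
      "Graph Theory",
      "Algorithms in C++",
    ],
  },
  name: "robert",
};

// Map step: emit a (book, 1) pair for every array entry.
function mapBooks(d) {
  return d.favorites.books.map((book) => [book, 1]);
}

// Reduce step: sum the emitted values per key.
function reduceCounts(pairs) {
  const counts = new Map();
  for (const [book, n] of pairs) {
    counts.set(book, (counts.get(book) || 0) + n);
  }
  return counts;
}

// Build the filter and the $set update that replaces the whole array
// with its distinct entries (the keys of the count map).
function buildSetUpdate(d, counts) {
  return {
    filter: { _id: d._id },
    update: { $set: { "favorites.books": [...counts.keys()] } },
  };
}

const counts = reduceCounts(mapBooks(doc));
const duplicates = [...counts].filter(([, n]) => n > 1).map(([book]) => book);
const { filter, update } = buildSetUpdate(doc, counts);

console.log(duplicates);
console.log(JSON.stringify(update));
```

The resulting filter and update could then be passed to an update call on the users collection, for example db.users.update(filter, update) in the shell, with a real ObjectId in the filter.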

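Answer 2 (score: 4)

The simplest solution is to use $setUnion (available since Mongo 2.6; the $addFields stage used here requires Mongo 3.4+). The order of the elements in the resulting array is unspecified: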

db.users.aggregate([
    {'$addFields': {'favorites.books': {'$setUnion': ['$favorites.books', []]}}}
])
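The aggregate call above only returns deduplicated documents; it does not change what is stored. One possible way to persist the result, assuming MongoDB 4.2 or later (where update commands accept an aggregation pipeline), is sketched below. dedupeBooksPipeline and dedupeInPlace are hypothetical names, and fakeUsers is a stand-in object so the sketch runs without a server.

```javascript
// Assumption: MongoDB 4.2+, where updateMany accepts an aggregation pipeline.
// dedupeBooksPipeline, dedupeInPlace and fakeUsers are hypothetical names.
const dedupeBooksPipeline = [
  { $set: { "favorites.books": { $setUnion: ["$favorites.books", []] } } },
];

// Only touch documents where the field really is an array, as a precaution:
// $setUnion returns null for a missing field and errors on a non-array value.
const onlyArrays = { "favorites.books": { $type: "array" } };

// `collection` is anything exposing updateMany(filter, update),
// for example db.users in the shell.
function dedupeInPlace(collection) {
  return collection.updateMany(onlyArrays, dedupeBooksPipeline);
}

// Stand-in collection that only records the call it receives.
const calls = [];
const fakeUsers = {
  updateMany(filter, update) {
    calls.push({ filter, update });
    return { acknowledged: true };
  },
};

dedupeInPlace(fakeUsers);
console.log(JSON.stringify(calls[0], null, 2));
```

Against a real deployment the call would be dedupeInPlace(db.users). As with the aggregate version, the order of the stored array after the update is not guaranteed.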

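Another (more verbose) version is based on the idea from @kynan's answer, but preserves all other fields without specifying them explicitly (Mongo 3.6+, since $mergeObjects was added in 3.6):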

> db.users.aggregate([
    {'$unwind': {
        'path': '$favorites.books',
        // output the document even if its list of books is empty
        'preserveNullAndEmptyArrays': true
    }},
    {'$group': {
        '_id': '$_id',
        'books': {'$addToSet': '$favorites.books'},
        // arbitrary name that doesn't exist on any document
        '_other_fields': {'$first': '$$ROOT'},
    }},
    {
      // the field, in the resulting document, has the value from the last document merged for the field. (c) docs
      // so the new deduped array value will be used
      '$replaceRoot': {'newRoot': {'$mergeObjects': ['$_other_fields', "$$ROOT"]}}
    },
    // this stage wouldn't be necessary if the field wasn't nested
    {'$addFields': {'favorites.books': '$books'}},
    {'$project': {'_other_fields': 0, 'books': 0}}
])

{ "_id" : ObjectId("4f6cd3c47156522f4f45b26f"), "name" : "robert", "favorites" : 
{ "books" : [ "The Art of Computer Programmning", "Graph Theory", "Algorithms in C++" ] } }    
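The last three stages rely on the "last value wins" rule of $mergeObjects. A rough plain-JavaScript analogy of those stages is sketched below; mergeObjects is a hypothetical helper built on Object.assign, the grouped object is what the $group stage is assumed to produce for the example document, and a string stands in for the ObjectId.

```javascript
// Plain JavaScript analogy, not MongoDB code. mergeObjects is a hypothetical
// helper: later arguments overwrite same-named fields of earlier ones.
function mergeObjects(...objs) {
  return Object.assign({}, ...objs);
}

// Assumed output of the $group stage for the example document. After $unwind,
// the first $$ROOT holds a single book under favorites.books.
const grouped = {
  _id: "4f6cd3c47156522f4f45b26f",
  books: ["The Art of Computer Programmning", "Graph Theory", "Algorithms in C++"],
  _other_fields: {
    _id: "4f6cd3c47156522f4f45b26f",
    name: "robert",
    favorites: { books: "Algorithms in C++" },
  },
};

// $replaceRoot + $mergeObjects: original fields first, then the grouped fields.
const merged = mergeObjects(grouped._other_fields, grouped);

// $addFields: put the deduplicated array back at the nested path.
merged.favorites = { ...merged.favorites, books: merged.books };

// $project: drop the two helper fields.
delete merged._other_fields;
delete merged.books;

console.log(JSON.stringify(merged));
```

The nested favorites.books path is why the extra $addFields stage is needed: merging only replaces top-level fields, so the deduplicated array has to be copied into place separately.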

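Answer 3 (score: 0)

Starting in Mongo 4.4, the $function aggregation operator allows applying a custom JavaScript function to implement behaviour not supported by the MongoDB Query Language.

For instance, in order to remove duplicates from an array: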

// {
//   "favorites" : { "books" : [
//     "Algorithms in C++",
//     "The Art of Computer Programming",
//     "Graph Theory",
//     "Algorithms in C++"
//   ]},
//   "name" : "robert"
// }
db.collection.aggregate(
  { $set:
    { "favorites.books":
      { $function: {
          body: function(books) { return books.filter((v, i, a) => a.indexOf(v) === i) },
          args: ["$favorites.books"],
          lang: "js"
      }}
    }
  }
)
// {
//   "favorites" : { "books" : [
//     "Algorithms in C++",
//     "The Art of Computer Programming",
//     "Graph Theory"
//   ]},
//   "name" : "robert"
// }

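One advantage, visible in the output above, is that the filter keeps the first occurrence of each element, so the original order of the array is preserved.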

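$function takes 3 parameters:

  • body, which is the function to apply, whose parameter is the array to modify.
  • args, which contains the fields from the record that the body function takes as parameters. In our case, "$favorites.books".
  • lang, which is the language in which the body function is written. Only js is currently available.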
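Since body is ordinary JavaScript, it can be tried outside MongoDB before being placed in a pipeline. The sketch below gives the same function a name (dedupeKeepOrder, a hypothetical name) and compares it with a Set-based variant, which is an alternative and not part of the answer above.

```javascript
// The function passed as `body` above, given a hypothetical name so it can
// be called directly in plain JavaScript.
const dedupeKeepOrder = function (books) {
  // Keep an element only if this is the first index at which it appears.
  return books.filter((v, i, a) => a.indexOf(v) === i);
};

// Alternative (not from the answer): a Set also keeps first occurrences.
const dedupeWithSet = (books) => [...new Set(books)];

const books = [
  "Algorithms in C++",
  "The Art of Computer Programming",
  "Graph Theory",
  "Algorithms in C++",
];

console.log(dedupeKeepOrder(books));
console.log(dedupeWithSet(books));
```

Both variants compare elements by value only for primitives such as strings and numbers; an array of subdocuments would need a different comparison.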

Answer 4 (score: 0)

rowData = [];
  colors = ['#123456', '#654321', '#abcabc', '#666666']

  constructor() {
    for (let i = 1; i < 20; i++) {
      this.rowData.push({ a: 'A' + i, b: 'B' + i});
    }
  }