How do I remove duplicate values from an array inside a MongoDB document?

Time: 2014-01-13 09:30:32

Tags: mongodb

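Here is a sample JSON document. The TEXT array contains duplicate values; for example, the entry with "SEQUENCE": 1 is repeated. I want only one copy of each entry to remain. Can you suggest how to remove the duplicates? I have 100 documents like this.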

Please suggest a single query that does this.

"description": {
      "content": [],
      "details": [
        {
          "CONTENT_TYPE_ID": 0,
          "DESCRIPTION_NAME": "Bullets",
          "DESCRIPTION_TYPE_ID": 2,
          "TAB_SEQ": 0,
          "TEXT": [
            {
              "SEQUENCE": 1,
              "DESCRIPTION": "Double sided print, allows complete application on one sheet"
            },
            {
              "SEQUENCE": 1,
              "DESCRIPTION": "Double sided print, allows complete application on one sheet"
            },
            {
              "SEQUENCE": 1,
              "DESCRIPTION": "Double sided print, allows complete application on one sheet"
            },
            {
              "SEQUENCE": 2,
              "DESCRIPTION": "8-1/2\" x 11\""
            },
            {
              "SEQUENCE": 2,
              "DESCRIPTION": "8-1/2\" x 11\""
            },
            {
              "SEQUENCE": 2,
              "DESCRIPTION": "8-1/2\" x 11\""
            },
            {
              "SEQUENCE": 3,
              "DESCRIPTION": "One-part"
            },
            {
              "SEQUENCE": 3,
              "DESCRIPTION": "One-part"
            },
            {
              "SEQUENCE": 3,
              "DESCRIPTION": "One-part"
            },
            {
              "SEQUENCE": 4,
              "DESCRIPTION": "3-hole punched"
            },
            {
              "SEQUENCE": 4,
              "DESCRIPTION": "3-hole punched"
            },
            {
              "SEQUENCE": 4,
              "DESCRIPTION": "3-hole punched"
            },
            {
              "SEQUENCE": 5,
              "DESCRIPTION": "50 forms per pack, 2 packs included"
            },
            {
              "SEQUENCE": 5,
              "DESCRIPTION": "50 forms per pack, 2 packs included"
            },
            {
              "SEQUENCE": 5,
              "DESCRIPTION": "50 forms per pack, 2 packs included"
            },
            {
              "SEQUENCE": 6,
              "DESCRIPTION": "Employment and other laws change periodically.  Check the laws in your jurisdiction to see if this form is acceptable."
            },
            {
              "SEQUENCE": 6,
              "DESCRIPTION": "Employment and other laws change periodically.  Check the laws in your jurisdiction to see if this form is acceptable."
            },
            {
              "SEQUENCE": 6,
              "DESCRIPTION": "Employment and other laws change periodically.  Check the laws in your jurisdiction to see if this form is acceptable."
            }
          ]
        }]
    }   

1 answer:

Answer 0 (score: 0)

I think you can use the aggregation pipeline to build a new collection. My approach would be:

  1. $unwind TEXT
  2. $group to eliminate the duplicates
  3. $group again to rebuild the document
  4. $project to finish the rebuild
  5. $out to another collection

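    db.description.aggregate([
      // One output document per element of the TEXT array. Paths are as in
      // this answer; in the question's sample, TEXT sits under
      // description.details, which is itself an array, so adjust as needed.
      { $unwind: "$details.TEXT" },
      // Grouping on (document _id, TEXT entry) collapses identical entries.
      { $group: { "_id": { "key": "$_id", "TEXT": "$details.TEXT" } } },
      // Regroup per source document. $group field names cannot contain
      // dots, so the array is collected as TEXT, not "details.TEXT".
      { $group: { "_id": "$_id.key", "TEXT": { $push: "$_id.TEXT" } } }
    ]);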

I have not tested it, and you will have to work out the details, but I think this will do the job!
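
Since the pipeline above is untested and $group does not guarantee the order of the regrouped entries, an alternative for only 100 documents is to deduplicate each TEXT array in application code and write it back. The following is a minimal sketch of that swapped-in technique, not the approach from the answer. It assumes the document shape shown in the question (description.details[].TEXT) and treats two entries as duplicates when both SEQUENCE and DESCRIPTION match. The helper names dedupeText and dedupeDocument and the collection name products are hypothetical.

```javascript
// Hypothetical helper: keep the first entry for each (SEQUENCE, DESCRIPTION)
// pair and preserve the original order of the array.
function dedupeText(entries) {
  const seen = new Set();
  const result = [];
  for (const entry of entries) {
    // Serialising both fields as a JSON array gives an unambiguous key.
    const key = JSON.stringify([entry.SEQUENCE, entry.DESCRIPTION]);
    if (!seen.has(key)) {
      seen.add(key);
      result.push(entry);
    }
  }
  return result;
}

// Hypothetical helper: return a copy of the document in which every
// details[].TEXT array has been deduplicated. Assumes the shape from the
// question: doc.description.details[].TEXT.
function dedupeDocument(doc) {
  const details = doc.description.details.map((detail) => ({
    ...detail,
    TEXT: dedupeText(detail.TEXT || []),
  }));
  return { ...doc, description: { ...doc.description, details } };
}

// Small sample in the shape of the question's data.
const sample = {
  description: {
    content: [],
    details: [
      {
        DESCRIPTION_NAME: "Bullets",
        TEXT: [
          { SEQUENCE: 1, DESCRIPTION: "One-part" },
          { SEQUENCE: 1, DESCRIPTION: "One-part" },
          { SEQUENCE: 1, DESCRIPTION: "One-part" },
          { SEQUENCE: 2, DESCRIPTION: "3-hole punched" },
          { SEQUENCE: 2, DESCRIPTION: "3-hole punched" },
        ],
      },
    ],
  },
};

const cleaned = dedupeDocument(sample);
console.log(cleaned.description.details[0].TEXT.length); // prints 2

// In mongosh the write-back could look like this (collection name assumed):
// db.products.find().forEach(function (doc) {
//   const fixed = dedupeDocument(doc);
//   db.products.updateOne(
//     { _id: doc._id },
//     { $set: { "description.details": fixed.description.details } }
//   );
// });
```

Because the helpers return copies rather than mutating their input, the cleaned result can be compared with the original before anything is written to the database.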