Question

My MongoDB collection contains documents with the following structure:

{
    "_id" : ObjectId("5889ce0d2e9bfa938c49208d"),
    "filewise_word_freq" : {
            "33236365" : [
                    [
                            "cluster",
                            4
                    ],
                    [
                            "question",
                            2
                    ],
                    [
                            "differ",
                            2
                    ],[
                            "come",
                            1
                    ]
            ],
            "33204685" : [
                    [
                            "node",
                            6
                    ],
                    [
                            "space",
                            4
                    ],
                    [
                            "would",
                            3
                    ],[
                            "templat",
                            1
                    ]
            ]
    },
    "file_root" : "socialcast",
    "main_cluster_name" : "node",
    "most_common_words" : [
            [
                    "node",
                    16
            ],
            [
                    "cluster",
                    7
            ],
            [
                    "n't",
                    3
            ]
    ]
}

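I want to search for the value "node" in the arrays of arrays stored under each file name key ("33236365", "33204685" and so on in my example) of the dict filewise_word_freq. If the value ("node") appears in any of the inner arrays under a file name (33204685), that file name (33204685) should be returned.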

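I tried the approach described in another Stack Overflow post.

I adapted it to my use case, but it does not work. Above all, I don't know how to return only the file name rather than the whole object or document.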

db.frequencydist.find({"file_root":'socialcast',"main_cluster_name":"node","filewise_word_freq":{$elemMatch:{$elemMatch:{$elemMatch:{$in:["node"]}}}}}).pretty()

It returns nothing. Please help.

Answer 1

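You can try something like this. The query part matches node, and the projection part returns only filewise_word_freq.33204685. Note that the file name has to be known in advance, because it is part of the field path.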

db.collection.find({
    "file_root": 'socialcast',
    "main_cluster_name": "node",
    "filewise_word_freq.33204685": {
        $elemMatch: {
            $elemMatch: {
                $in: ["node"]
            }
        }
    }
}, {
    "filewise_word_freq.33204685": 1
}).pretty();
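
Because the file name is part of the field path, the query above has to be rebuilt for every file name you want to test. A minimal sketch of that follows; buildFileQuery is a hypothetical helper (not part of MongoDB), and its result is meant to be passed to find():

```javascript
// Hypothetical helper: builds the filter and projection shown above
// for an arbitrary file name and word. Not part of the MongoDB API.
function buildFileQuery(fileName, word) {
    // The file name is a key of filewise_word_freq, so it goes into the field path.
    var path = "filewise_word_freq." + fileName;

    var filter = {
        "file_root": "socialcast",
        "main_cluster_name": "node"
    };
    // Outer $elemMatch selects a [word, count] pair,
    // inner $elemMatch tests the elements of that pair.
    filter[path] = { $elemMatch: { $elemMatch: { $in: [word] } } };

    var projection = {};
    projection[path] = 1;

    return { filter: filter, projection: projection };
}

var q = buildFileQuery("33204685", "node");
// In the mongo shell: db.collection.find(q.filter, q.projection).pretty();
```

A document comes back only when the given file contains the word, so the application still has to loop over the candidate file names.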

Answer 2

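The data model you have chosen makes queries, and even aggregation, very difficult. I suggest revising your document model. That said, I think you can use $where: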

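db.collection.find({
    "file_root": "socialcast",
    "main_cluster_name": "node",
    // $where runs this function for each candidate document; each inner array is [word, count]
    $where: function () {
        for (var i in this.filewise_word_freq) {
            for (var j in this.filewise_word_freq[i]) {
                if (this.filewise_word_freq[i][j].indexOf("node") >= 0) { return true; }
            }
        }
        return false;
    }
})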

Yes, this returns the whole document, so you may need to filter out the file names in your application.
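
The application-side filtering can be as simple as the following sketch. fileNamesContaining is a hypothetical helper, and doc stands for a document that has already been fetched by the query above (shortened here to keep the example self-contained):

```javascript
// Hypothetical helper: returns the file names whose word list contains `word`.
// `doc` is a document in the original format, where each value of
// filewise_word_freq is an array of [word, count] pairs.
function fileNamesContaining(doc, word) {
    var freq = doc.filewise_word_freq || {};
    return Object.keys(freq).filter(function (fileName) {
        return freq[fileName].some(function (pair) {
            return pair[0] === word;
        });
    });
}

// Shortened copy of the document from the question.
var doc = {
    filewise_word_freq: {
        "33236365": [["cluster", 4], ["question", 2]],
        "33204685": [["node", 6], ["space", 4]]
    }
};

console.log(fileNamesContaining(doc, "node")); // [ '33204685' ]
```

In the mongo shell the same function could be called from a forEach over the cursor returned by find().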

You may also want to look at map-reduce, but it is not recommended.

Another option is stored functions: they run on the MongoDB server and are saved in a special collection (system.js).
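
If the server is recent enough to support the $objectToArray aggregation operator (an assumption, since the question does not state the MongoDB version), the file names could also be extracted on the server without changing the model. A sketch, not tested against the asker's data:

```javascript
// Sketch only: assumes a MongoDB version that supports $objectToArray.
var pipeline = [
    { $match: { file_root: "socialcast", main_cluster_name: "node" } },
    // Turn { "<fileName>": [[word, count], ...] } into [{ k: "<fileName>", v: [...] }, ...]
    { $project: { files: { $objectToArray: "$filewise_word_freq" } } },
    // One document per file entry
    { $unwind: "$files" },
    // Same nested $elemMatch as in Answer 1, applied to each file's pairs
    { $match: { "files.v": { $elemMatch: { $elemMatch: { $in: ["node"] } } } } },
    // Collect the matching keys (file names) per original document
    { $group: { _id: "$_id", fileNames: { $addToSet: "$files.k" } } }
];
// In the mongo shell: db.collection.aggregate(pipeline)
```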

Still, coming back to the data model: if at all possible, revise it. Perhaps something like:

{
    "_id" : ObjectId("5889ce0d2e9bfa938c49208d"),
    "filewise_word_freq" : [
              {
                    "fileName":"33236365",
                    "word_counts" : {
                       "cluster":4,
                       "question":2,
                       "differ":2,
                       "come":1
                    }
            },
            {
                    "fileName":"33204685",
                    "word_counts" : {
                       "node":6,
                       "space":4,
                       "would":3,
                       "template":1
                    }
            }
    ],
    "file_root" : "socialcast",
    "main_cluster_name" : "node",
    "most_common_words" : [
            {
                    "node":16
            },
            {
                    "cluster":7
            },
            {
                    "n't":3
            }
    ]
}

Running aggregations on documents like this would be much easier.
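
Existing documents would have to be converted to this shape first. A minimal sketch of such a conversion in plain JavaScript follows; toFileArrayModel is a hypothetical helper, and it assumes every word is acceptable as a field name (depending on the MongoDB version, keys containing a dot or starting with $ may be rejected):

```javascript
// Hypothetical helper: converts a document from the original format
// (file name -> array of [word, count] pairs) to the format proposed above.
function toFileArrayModel(doc) {
    // Turns [["node", 6], ["space", 4]] into { node: 6, space: 4 }.
    function pairsToCounts(pairs) {
        var counts = {};
        pairs.forEach(function (pair) { counts[pair[0]] = pair[1]; });
        return counts;
    }

    var freq = doc.filewise_word_freq || {};
    return {
        _id: doc._id,
        filewise_word_freq: Object.keys(freq).map(function (fileName) {
            return { fileName: fileName, word_counts: pairsToCounts(freq[fileName]) };
        }),
        file_root: doc.file_root,
        main_cluster_name: doc.main_cluster_name,
        // One single-key object per word, as in the example document.
        most_common_words: (doc.most_common_words || []).map(function (pair) {
            var entry = {};
            entry[pair[0]] = pair[1];
            return entry;
        })
    };
}
```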

For this model, the aggregation would look something like:

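db.collection.aggregate([
    // one output document per file entry
    {$unwind: "$filewise_word_freq"},
    // keep only the entries whose word_counts contain "node"
    {$match: {"filewise_word_freq.word_counts.node": {$gte: 0}}},
    // collect the matching file names into a single set
    {$group: {_id: 1, fileNames: {$addToSet: "$filewise_word_freq.fileName"}}},
    // drop _id so that only fileNames remains
    {$project: {_id: 0, fileNames: 1}}
])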

This gives you a single document with a single field, fileNames, which contains the list of all matching file names:

{
  fileNames : ["33204685"]
}

Is there a way to query a dict of arrays of arrays in MongoDB?
