Display the corresponding fields from Mongo in a list

Time: 2016-04-20 19:31:27

Tags: mongodb mongodb-query aggregation-framework

This is the data I have stored:

{ "_id" : ObjectId("57080a7b01351177a4113f63"), "title" : "Data Scientist", "url" : "https://www.Pinterest.com/jobs/732?t=nu6xow", "timestamp" : "2016-04-08 19:46:03", "company" : "Pinterest", "state" : " CA", "todays_date" : "04/08/2016", "city_name" : "San+Francisco", "location" : "San Francisco, CA", "team" : "T0BT323QS", "search_word" : "Data+scientist"}
{ "_id" : ObjectId("57080a7b01351177a4113f64"), "title" : "Director of Analytics / Data Mining", "url" : "http://www.Pinterest.com/careers-position-data-mining-leader", "timestamp" : "2016-04-08 19:46:03", "company" : "Pinterest", "state" : " CA", "todays_date" : "04/08/2016", "city_name" : "San+Francisco", "location" : "Silicon Valley, CA", "team" : "T0BT323QS", "search_word" : "Data+scientist"}
{ "_id" : ObjectId("57080a7d01351177a4113f65"), "title" : "Senior Real World Data Scientist", "url" : "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568", "timestamp" : "2016-04-08 19:46:05", "company" : "Pinterest", "state" : " CA", "todays_date" : "04/08/2016", "city_name" : "San+Francisco", "location" : "South San Francisco, CA", "team" : "T0BT323QS", "search_word" : "Data+scientist"}

This is my query:

db.Books.aggregate([{$match:{"timestamp":{
       $gte: "2016-04-08 19:46:03", $lt: "2016-04-08 19:46:06"}}}
     ,{ "$group": {
        "_id": "$company",
        "count": { "$sum": 1 },
        "urls": {
            "$addToSet": "$url"
        }
    }},
    { "$sort": { "count": -1 } },
    { "$limit": 10 },
    { "$project": {
        "count": 1,
        "urls": { "$slice": ["$urls",0, 3] }
    }}
])

This is the output:

{ 
    "_id" : "Pinterest", 
    "urls" : [ 
        "https://www.Pinterest.com/jobs/732?t=nu6xow",
        "http://www.Pinterest.com/careers-position-data-mining-leader",
        "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568" 
    ] 
}

However, along with each "url", I want it to show the corresponding "title" and "location" fields, like this:

{ 
    "_id" : "Pinterest", 
    "urls" : [ 
        [
            "https://www.Pinterest.com/jobs/732?t=nu6xow",
            "Data Scientist","San Francisco, CA"
        ],[
            "http://www.Pinterest.com/careers-position-data-mining-leader",
            "Director of Analytics / Data Mining","Silicon Valley, CA"
        ],[
            "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568",
            "Senior Real World Data Scientist",
            "South San Francisco, CA"
        ]
]}

2 answers:

Answer 0 (score: 1)

One way to get a similar document is to use $push to add a sub-document containing the selected fields to the urls array:

db.Books.aggregate([{$match:{"timestamp":{
       $gte: "2016-04-08 19:46:03", $lt: "2016-04-08 19:46:06"}}}
     ,{ "$group": {
        "_id": "$company",
        "count": { "$sum": 1 },
        "urls": {
            "$push": {url:"$url", title:"$title", location:"$location"}
        }
    }},
    { "$sort": { "count": -1 } },
    { "$limit": 10 },
    { "$project": {
        "count": 1,
        "urls": { "$slice": ["$urls",0, 3] }
    }}
])

Then you get a document like this:

{
    "_id" : "Pinterest",
    "count" : 3,
    "urls" : [ 
        {
            "url" : "https://www.Pinterest.com/jobs/732?t=nu6xow",
            "title" : "Data Scientist",
            "location" : "San Francisco, CA"
        }, 
        {
            "url" : "http://www.Pinterest.com/careers-position-data-mining-leader",
            "title" : "Director of Analytics / Data Mining",
            "location" : "Silicon Valley, CA"
        }, 
        {
            "url" : "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568",
            "title" : "Senior Real World Data Scientist",
            "location" : "South San Francisco, CA"
        }
    ]
}
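To make it easier to see what the $group / $push / $slice stages compute, here is a minimal in-memory sketch in plain JavaScript. It is not MongoDB code: `groupByCompany` is a hypothetical helper, the $match stage is left out, and the sample documents are two of the records from the question, reduced to the fields used.

```javascript
// In-memory sketch of: $group by company, $sum a count,
// $push {url, title, location}, $sort by count descending, $limit 10,
// then keep the first three pushed entries ($slice: ["$urls", 0, 3]).
// groupByCompany is a hypothetical helper, not part of any MongoDB API.
function groupByCompany(docs) {
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.company)) {
      groups.set(doc.company, { _id: doc.company, count: 0, urls: [] });
    }
    const group = groups.get(doc.company);
    group.count += 1;
    // Pushing one object per document keeps url, title and location together.
    group.urls.push({ url: doc.url, title: doc.title, location: doc.location });
  }
  return [...groups.values()]
    .sort((a, b) => b.count - a.count)
    .slice(0, 10)
    .map((g) => ({ _id: g._id, count: g.count, urls: g.urls.slice(0, 3) }));
}

const sample = [
  {
    company: "Pinterest",
    url: "https://www.Pinterest.com/jobs/732?t=nu6xow",
    title: "Data Scientist",
    location: "San Francisco, CA",
  },
  {
    company: "Pinterest",
    url: "http://www.Pinterest.com/careers-position-data-mining-leader",
    title: "Director of Analytics / Data Mining",
    location: "Silicon Valley, CA",
  },
];

const result = groupByCompany(sample);
console.log(JSON.stringify(result, null, 2));
```

Because each pushed element is a single object, the url, title and location of one job cannot drift apart, which is what three separate $addToSet arrays would risk.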

Any questions are welcome!

Have fun!

Answer 1 (score: 1)

For MongoDB versions 2.6 up to 3.2, you need some help from $map:

db.Books.aggregate([
   { "$match":{ 
       "timestamp":{
           "$gte": "2016-04-08 19:46:03", "$lt": "2016-04-08 19:46:06"
       }
   }},
   { "$group": {
       "_id": "$company",
       "count": { "$sum": 1 },
       "urls": {
            "$push": {
                "$map": {
                    "input": [ "A", "B", "C" ],
                    "as": "el",
                    "in": {
                      "$cond": [
                        { "$eq": [ "$$el", "A" ] },
                        "$url",
                        { "$cond": [
                          { "$eq": [ "$$el", "B" ] },
                          "$title",
                          "$location"
                        ]}
                      ]
                    }
                }
            }
       }
   }},
   { "$sort": { "count": -1 } },
   { "$limit": 10 },
   { "$project": {
     "count": 1,
     "urls": { "$slice": ["$urls",0, 3] }
  }}
])

That is how you get each item written out as an array.
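The "A", "B", "C" literals in the $map expression above are only position markers: each one is replaced by a field of the current document through the nested $cond. A rough plain-JavaScript equivalent of what happens per document might look like this (`toTriple` is a hypothetical name used only for illustration, and the sample document is taken from the question):

```javascript
// Mirrors the $map / $cond expression: iterate over three placeholder
// values and swap each one for a field of the current document.
// toTriple is a hypothetical helper, not a MongoDB function.
function toTriple(doc) {
  return ["A", "B", "C"].map((el) =>
    el === "A" ? doc.url : el === "B" ? doc.title : doc.location
  );
}

const doc = {
  url: "https://www.Pinterest.com/jobs/732?t=nu6xow",
  title: "Data Scientist",
  location: "San Francisco, CA",
};

const triple = toTriple(doc);
console.log(triple);
```

The resulting array carries no field names, so consumers have to rely on the position of each element.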

You should probably do this instead, though:

db.Books.aggregate([
   { "$match":{ 
       "timestamp":{
           "$gte": "2016-04-08 19:46:03", "$lt": "2016-04-08 19:46:06"
       }
   }},
   { "$group": {
       "_id": "$company",
       "count": { "$sum": 1 },
       "urls": {
            "$push": {
               "url": "$url",
               "title": "$title",
               "location": "$location"
            }
       }
   }},
   { "$sort": { "count": -1 } },
   { "$limit": 10 },
   { "$project": {
     "count": 1,
     "urls": { "$slice": ["$urls",0, 3] }
  }}
])

because it actually identifies the fields by key. But if for some reason you prefer the array format, that is how you can do it.

For $addToSet behaviour, simply replace $push with $addToSet. But if the fields are not all unique together, then $group on the "url" property first:

db.Books.aggregate([
   { "$match":{ 
       "timestamp":{
           "$gte": "2016-04-08 19:46:03", "$lt": "2016-04-08 19:46:06"
       }
   }},
   { "$group": {
     "_id": { 
        "company": "$company",
        "url": "$url"
     },
     "title": { "$first": "$title" },
     "location": { "$first": "$location" },
     "count": { "$sum": 1 }
   }},
   { "$group": {
       "_id": "$_id.company",
       "count": { "$sum": "$count" },
       "urls": {
            "$push": {
                "$map": {
                    "input": [ "A", "B", "C" ],
                    "as": "el",
                    "in": {
                      "$cond": [
                        { "$eq": [ "$$el", "A" ] },
                        "$_id.url",
                        { "$cond": [
                          { "$eq": [ "$$el", "B" ] },
                          "$title",
                          "$location"
                        ]}
                      ]
                    }
                }
            }
       }
   }},
   { "$sort": { "count": -1 } },
   { "$limit": 10 },
   { "$project": {
     "count": 1,
     "urls": { "$slice": ["$urls",0, 3] }
  }}
])
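The reason for grouping on "url" first is that $addToSet compares whole values, so two entries that share a url but differ in title are both kept. The following plain-JavaScript sketch illustrates the difference under simplifying assumptions: the data is made up for illustration (the example.com URLs are not from the question), `addToSetWhole` and `firstPerUrl` are hypothetical helpers, and "first" here simply means input order.

```javascript
// Roughly what $addToSet does with whole entries: an entry is dropped only
// when every field matches an entry that was already collected.
function addToSetWhole(entries) {
  const seen = new Set();
  const out = [];
  for (const e of entries) {
    const key = JSON.stringify([e.url, e.title, e.location]);
    if (!seen.has(key)) {
      seen.add(key);
      out.push(e);
    }
  }
  return out;
}

// Roughly what the first $group stage does: one entry per url, keeping the
// title and location of the first document seen for that url ($first).
function firstPerUrl(entries) {
  const byUrl = new Map();
  for (const e of entries) {
    if (!byUrl.has(e.url)) {
      byUrl.set(e.url, e);
    }
  }
  return [...byUrl.values()];
}

// Hypothetical data: the first two entries share a url but not a title.
const entries = [
  { url: "http://example.com/job/1", title: "Data Scientist", location: "San Francisco, CA" },
  { url: "http://example.com/job/1", title: "Data Scientist (reposted)", location: "San Francisco, CA" },
  { url: "http://example.com/job/2", title: "Analyst", location: "Silicon Valley, CA" },
];

console.log(addToSetWhole(entries).length); // 3
console.log(firstPerUrl(entries).length); // 2
```

In the pipeline above, the second $group sums the per-url counts, so "count" still reflects the number of matched documents rather than the number of distinct urls.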