This is the data I have stored:
{ "_id" : ObjectId("57080a7b01351177a4113f63"), "title" : "Data Scientist", "url" : "https://www.Pinterest.com/jobs/732?t=nu6xow", "timestamp" : "2016-04-08 19:46:03", "company" : "Pinterest", "state" : " CA", "todays_date" : "04/08/2016", "city_name" : "San+Francisco", "location" : "San Francisco, CA", "team" : "T0BT323QS", "search_word" : "Data+scientist"}
{ "_id" : ObjectId("57080a7b01351177a4113f64"), "title" : "Director of Analytics / Data Mining", "url" : "http://www.Pinterest.com/careers-position-data-mining-leader", "timestamp" : "2016-04-08 19:46:03", "company" : "Pinterest", "state" : " CA", "todays_date" : "04/08/2016", "city_name" : "San+Francisco", "location" : "Silicon Valley, CA", "team" : "T0BT323QS", "search_word" : "Data+scientist"}
{ "_id" : ObjectId("57080a7d01351177a4113f65"), "title" : "Senior Real World Data Scientist", "url" : "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568", "timestamp" : "2016-04-08 19:46:05", "company" : "Pinterest", "state" : " CA", "todays_date" : "04/08/2016", "city_name" : "San+Francisco", "location" : "South San Francisco, CA", "team" : "T0BT323QS", "search_word" : "Data+scientist"}
This is my query:
db.Books.aggregate([
  { $match: { "timestamp": {
    $gte: "2016-04-08 19:46:03", $lt: "2016-04-08 19:46:06"
  }}},
  { "$group": {
    "_id": "$company",
    "count": { "$sum": 1 },
    "urls": {
      "$addToSet": "$url"
    }
  }},
  { "$sort": { "count": -1 } },
  { "$limit": 10 },
  { "$project": {
    "count": 1,
    "urls": { "$slice": ["$urls",0, 3] }
  }}
])
This is the output:
{
  "_id" : "Pinterest",
  "urls" : [
    "https://www.Pinterest.com/jobs/732?t=nu6xow",
    "http://www.Pinterest.com/careers-position-data-mining-leader",
    "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568"
  ]
}
However, along with each "url" I would like it to show the corresponding "title" and "location" fields, like this:
{
  "_id" : "Pinterest",
  "urls" : [
    [
      "https://www.Pinterest.com/jobs/732?t=nu6xow",
      "Data Scientist",
      "San Francisco, CA"
    ],
    [
      "http://www.Pinterest.com/careers-position-data-mining-leader",
      "Director of Analytics / Data Mining",
      "Silicon Valley, CA"
    ],
    [
      "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568",
      "Senior Real World Data Scientist",
      "South San Francisco, CA"
    ]
  ]
}
Answer 0 (score: 1)
One way to get a document like that is to $push a sub-document containing the selected fields onto the urls array:
db.a1.aggregate([
  { $match: { "timestamp": {
    $gte: "2016-04-08 19:46:03", $lt: "2016-04-08 19:46:06"
  }}},
  { "$group": {
    "_id": "$company",
    "count": { "$sum": 1 },
    "urls": {
      "$push": { url: "$url", title: "$title", location: "$location" }
    }
  }},
  { "$sort": { "count": -1 } },
  { "$limit": 10 },
  { "$project": {
    "count": 1,
    "urls": { "$slice": ["$urls",0, 3] }
  }}
])
Then you get a document like this:
{
  "_id" : "Pinterest",
  "count" : 3,
  "urls" : [
    {
      "url" : "https://www.Pinterest.com/jobs/732?t=nu6xow",
      "title" : "Data Scientist",
      "location" : "San Francisco, CA"
    },
    {
      "url" : "http://www.Pinterest.com/careers-position-data-mining-leader",
      "title" : "Director of Analytics / Data Mining",
      "location" : "Silicon Valley, CA"
    },
    {
      "url" : "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568",
      "title" : "Senior Real World Data Scientist",
      "location" : "South San Francisco, CA"
    }
  ]
}
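If the nested-array shape from the question is still wanted, one option is to reshape the result in application code after the aggregation returns. The following is only a sketch in plain JavaScript: the helper name `urlsAsArrays` is made up for illustration and is not part of MongoDB, and `sampleResult` is simply the document shown above copied in by hand.

```javascript
// Hypothetical client-side helper (not a MongoDB API): turns each
// { url, title, location } object into a [url, title, location] array.
function urlsAsArrays(result) {
  return {
    _id: result._id,
    count: result.count,
    urls: result.urls.map(function (u) {
      return [u.url, u.title, u.location];
    })
  };
}

// The result document from above, copied by hand for the example.
const sampleResult = {
  _id: "Pinterest",
  count: 3,
  urls: [
    {
      url: "https://www.Pinterest.com/jobs/732?t=nu6xow",
      title: "Data Scientist",
      location: "San Francisco, CA"
    },
    {
      url: "http://www.Pinterest.com/careers-position-data-mining-leader",
      title: "Director of Analytics / Data Mining",
      location: "Silicon Valley, CA"
    },
    {
      url: "http://www.Pinterest.com/careers/detail/00443369/Senior-Real-World-Data-Scientist?src=JB-12568",
      title: "Senior Real World Data Scientist",
      location: "South San Francisco, CA"
    }
  ]
};

const converted = urlsAsArrays(sampleResult);
console.log(JSON.stringify(converted, null, 2));
```

Keeping the pipeline keyed and converting only at the edge means the positional format exists in one place, so a later change to the field list touches a single function.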
Any questions are welcome!
Have fun!
Answer 1 (score: 1)
For MongoDB versions 2.6 through 3.2, you need a little help from $map:
db.Books.aggregate([
  { "$match":{
    "timestamp":{
      "$gte": "2016-04-08 19:46:03", "$lt": "2016-04-08 19:46:06"
    }
  }},
  { "$group": {
    "_id": "$company",
    "count": { "$sum": 1 },
    "urls": {
      "$push": {
        "$map": {
          "input": [ "A", "B", "C" ],
          "as": "el",
          "in": {
            "$cond": [
              { "$eq": [ "$$el", "A" ] },
              "$url",
              { "$cond": [
                { "$eq": [ "$$el", "B" ] },
                "$title",
                "$location"
              ]}
            ]
          }
        }
      }
    }
  }},
  { "$sort": { "count": -1 } },
  { "$limit": 10 },
  { "$project": {
    "count": 1,
    "urls": { "$slice": ["$urls",0, 3] }
  }}
])
That is how you emit each item as an array.
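The trick in that pipeline is that the literal input `[ "A", "B", "C" ]` carries no data of its own; it only supplies three slots, and the nested `$cond` decides which document field fills each slot. A rough plain-JavaScript analogue may make that easier to see. This is a sketch that runs outside MongoDB, and the function name `toTriple` is hypothetical:

```javascript
// Plain-JavaScript analogue of the $map / $cond expression above.
// This is an illustration only; it does not run inside MongoDB.
function toTriple(doc) {
  return ["A", "B", "C"].map(function (el) {
    if (el === "A") return doc.url;    // first $cond branch
    if (el === "B") return doc.title;  // nested $cond, "then" branch
    return doc.location;               // nested $cond, "else" branch
  });
}

// One of the stored documents from the question, reduced to the three fields used.
const doc = {
  url: "https://www.Pinterest.com/jobs/732?t=nu6xow",
  title: "Data Scientist",
  location: "San Francisco, CA"
};

const triple = toTriple(doc);
console.log(triple);
```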
You probably really should do this instead:
db.Books.aggregate([
  { "$match":{
    "timestamp":{
      "$gte": "2016-04-08 19:46:03", "$lt": "2016-04-08 19:46:06"
    }
  }},
  { "$group": {
    "_id": "$company",
    "count": { "$sum": 1 },
    "urls": {
      "$push": {
        "url": "$url",
        "title": "$title",
        "location": "$location"
      }
    }
  }},
  { "$sort": { "count": -1 } },
  { "$limit": 10 },
  { "$project": {
    "count": 1,
    "urls": { "$slice": ["$urls",0, 3] }
  }}
])
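That form identifies each field by key. But if for some reason you prefer the array format, then the $map approach is how you do it.
For $addToSet, just replace $push with $addToSet. But if not all of the fields are unique, then $group on the "url" property first: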
db.Books.aggregate([
  { "$match":{
    "timestamp":{
      "$gte": "2016-04-08 19:46:03", "$lt": "2016-04-08 19:46:06"
    }
  }},
  { "$group": {
    "_id": {
      "company": "$company",
      "url": "$url"
    },
    "title": { "$first": "$title" },
    "location": { "$first": "$location" },
    "count": { "$sum": 1 }
  }},
  { "$group": {
    "_id": "$_id.company",
    "count": { "$sum": "$count" },
    "urls": {
      "$push": {
        "$map": {
          "input": [ "A", "B", "C" ],
          "as": "el",
          "in": {
            "$cond": [
              { "$eq": [ "$$el", "A" ] },
              "$_id.url",
              { "$cond": [
                { "$eq": [ "$$el", "B" ] },
                "$title",
                "$location"
              ]}
            ]
          }
        }
      }
    }
  }},
  { "$sort": { "count": -1 } },
  { "$limit": 10 },
  { "$project": {
    "count": 1,
    "urls": { "$slice": ["$urls",0, 3] }
  }}
])
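To see why grouping on the URL first matters, here is a plain-JavaScript sketch of what the two $group stages do. It is an analogue for illustration, not MongoDB code, and the rows in `rows` are made-up sample data in which the same URL appears twice with slightly different titles. A set of whole triples would keep both of those entries, because the triples differ, whereas grouping on (company, url) first keeps one entry per URL.

```javascript
// Made-up sample rows for illustration; the first two share a URL
// but have different titles.
const rows = [
  { company: "Pinterest", url: "https://example.com/a", title: "Data Scientist", location: "San Francisco, CA" },
  { company: "Pinterest", url: "https://example.com/a", title: "Data Scientist (repost)", location: "San Francisco, CA" },
  { company: "Pinterest", url: "https://example.com/b", title: "Director of Analytics", location: "Silicon Valley, CA" }
];

// Stage 1: one entry per (company, url), keeping the first title and
// location seen and counting how many rows fell into the entry.
const byCompanyUrl = new Map();
for (const row of rows) {
  const key = JSON.stringify([row.company, row.url]);
  if (!byCompanyUrl.has(key)) {
    byCompanyUrl.set(key, {
      company: row.company,
      url: row.url,
      title: row.title,
      location: row.location,
      count: 0
    });
  }
  byCompanyUrl.get(key).count += 1;
}

// Stage 2: one entry per company, summing the counts and collecting
// a [url, title, location] triple for each distinct URL.
const byCompany = new Map();
for (const entry of byCompanyUrl.values()) {
  if (!byCompany.has(entry.company)) {
    byCompany.set(entry.company, { _id: entry.company, count: 0, urls: [] });
  }
  const group = byCompany.get(entry.company);
  group.count += entry.count;
  group.urls.push([entry.url, entry.title, entry.location]);
}

const grouped = Array.from(byCompany.values());
console.log(JSON.stringify(grouped, null, 2));
```

One difference to keep in mind: the sketch visits rows in insertion order, whereas in the pipeline the order of documents reaching $first and $push is not guaranteed unless a $sort stage comes before the $group.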