My MongoDB document has the following structure. As you can see, it has 3 URLs, each with crawled set to True or False.
{
"_id": {
"$oid": "573b8e70e1054c00151152f7"
},
"domain": "example.com",
"good": [
{
"crawled": true,
"added": {
"$date": "2016-05-17T21:34:34.485Z"
},
"link": "/threads/11005-Cheap-booze!"
},
{
"crawled": false,
"added": {
"$date": "2016-05-17T21:34:34.485Z"
},
"link": "/threads/9445-This-week-s-voucher-codes"
},
{
"crawled": false,
"added": {
"$date": "2016-05-17T21:34:34.485Z"
},
"link": "/threads/9445-This-week-s-voucher-codes_2"
}
],
"link_found": false,
"subdomain": "http://www."
}
I am trying to return specific fields only for the URLs whose crawled is set to False. For that I have the following query:
.find({'good.crawled' : False}, {'good.link':True, 'domain':True, 'subdomain':True})
However, what comes back is not what I expected: it returns all the URLs, regardless of whether their crawled status is True or False.
What is returned:
{
u'domain': u'cashquestions.com',
u'_id': ObjectId('573b8e70e1054c00151152f7'),
u'subdomain': u'http://www.',
u'good': [
{
u'link': u'/threads/11005-Cheap-booze!'
},
{
u'link': u'/threads/9445-This-week-s-voucher-codes'
},
{
u'link': u'/threads/9445-This-week-s-voucher-codes_2'
}
]
}
Expected result:
{
u'domain': u'cashquestions.com',
u'_id': ObjectId('573b8e70e1054c00151152f7'),
u'subdomain': u'http://www.',
u'good': [
{
u'link': u'/threads/9445-This-week-s-voucher-codes'
},
{
u'link': u'/threads/9445-This-week-s-voucher-codes_2'
}
]
}
How can I specify that only the links with crawled set to False should be returned?
Answer 0 (score: 0)
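The filter in find() selects whole documents, and the projection only trims fields, so every element of the good array comes back. You'll want to use the aggregation framework instead (this works in MongoDB 3.0 and later):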
db.yourcollection.aggregate([
// optional: only documents with at least one uncrawled link
{$match: {'good.crawled': false}},
// get just the fields you need (plus _id)
{$project: {good: 1, domain: 1, subdomain: 1}},
// put each array element in a separate temporary document
{$unwind: '$good'},
// keep only the uncrawled elements
{$match: {'good.crawled': false}},
// undo the $unwind, rebuilding one document per _id
{$group: {_id: '$_id', domain: {$first: '$domain'}, subdomain: {$first: '$subdomain'}, good: {$push: '$good'}}}
])
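Since the question uses Python, here is a rough sketch of the same unwind, match, group idea written as a pipeline you could pass to pymongo. The function names build_uncrawled_pipeline and uncrawled_links are made up for illustration, and this version pushes only the link field so the output has the shape of the expected result in the question. The second helper is just a client-side check for a single document, not something MongoDB runs.

```python
def build_uncrawled_pipeline():
    """Build an aggregation pipeline that keeps only good[] entries
    whose crawled flag is False (hypothetical helper name)."""
    return [
        # Optional: skip documents that have no uncrawled link at all.
        {"$match": {"good.crawled": False}},
        # Keep only the fields that are needed (_id is included by default).
        {"$project": {"good": 1, "domain": 1, "subdomain": 1}},
        # Produce one temporary document per element of the good array.
        {"$unwind": "$good"},
        # Drop the elements that were already crawled.
        {"$match": {"good.crawled": False}},
        # Rebuild one document per _id, collecting only the links.
        {"$group": {
            "_id": "$_id",
            "domain": {"$first": "$domain"},
            "subdomain": {"$first": "$subdomain"},
            "good": {"$push": {"link": "$good.link"}},
        }},
    ]


def uncrawled_links(doc):
    """Client-side equivalent for one already-fetched document
    (hypothetical helper, useful for checking the expected shape)."""
    return [{"link": g["link"]} for g in doc.get("good", []) if not g["crawled"]]
```

Assuming collection is a pymongo Collection object, it would be used roughly as: for doc in collection.aggregate(build_uncrawled_pipeline()): print(doc)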