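I have been trying to work this out for many hours. I have a collection named "Jobs", and each document in it contains a sub-collection "site" (an embedded document), i.e. Jobs.site. That site sub-collection has a property "UNID".
I am trying to retrieve documents from the database based on a text search, and that part works fine.
But I want to retrieve only UNIQUE documents based on that Job.Site.UNID, and possibly add a count as an extra property. The result should look like this: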
Job: { site: { field1: 'EXAMPLE', UNID: 'SITEID', count: 5 } }
This would mean that there are 5 jobs in the Jobs collection with that site.UNID.
This is what I have so far:
[
// GETTING DOCS BASED ON TEXT SEARCH RESULTS
{
$match: {
// clientId: req.user.client_id,
$text: { $search: body.searchTerms }
}
},
// SORTING THEM BASED ON TEXTSCORE
{ $sort: { score: { $meta: 'textScore' } } },
// THE PROBLEMATIC GROUPING PART
{ $group: { site: { UPRN: '$UPRN', myCount: { $sum: 1 } } } },
// I ONLY WANT TO GET 20 DOCS AT A TIME
{ $limit: 20 },
// THE DATA THAT I WANT IN MY DOCUMENTS, MAYBE COUNT WOULD COME HERE?
{
$project: {
site: true,
score: { $meta: 'textScore' }
}
},
// GETTING RID OF POOR MATCHES BASED ON A SCORE CALCULATED IN ANOTHER
// FUNCTION BASED ON THE NUMBER OF WORDS IN THE TEXT SEARCH
{
$match: {
score: { $gt: matchScore }
}
}
]
This fails with: The field 'site' must be an accumulator object
So I cannot figure out the right syntax for grouping on that sub-collection property.
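EDIT: Thanks to @Anthony's great answer, V2 works well and I have tested it thoroughly. The only problem is that it does not seem to count the total number of jobs: the count is always 1, or whatever value I put in $sum, even though there are more than 200 results. This is the pipeline I am using: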
{ $match: { $text: { $search: body.searchTerms } } },
{ $sort: { $score: { $meta: 'textScore' } } },
// { $match: { score: { $gt: 0.1 } } },
{
$group: {
_id: '$UNID',
counter: { $sum: 1 },
score: { $first: { $meta: 'textScore' } },
title: { $first: '$title' },
postcode: { $first: '$postcode' },
addressLine1: { $first: '$addressLine1' },
city: { $first: '$city' },
projectName: { $first: '$projectName' },
jobsCount: { $sum: '$counter' }
}
},
{ $limit: 20 },
{
$project: {
UNID: '$_id',
title: '$title',
postcode: '$postcode',
addressLine1: '$addressLine1',
projectName: '$projectName',
city: '$city',
score: 1,
jobsCount: true
}
}
Sample data:
{
"_id": "randomString0",
"title": "Quality",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString1",
"title": "Some2123",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString2",
"title": "Random title",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString3",
"title": "Another unique job",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString4",
"title": "Other thing",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
},
{
"_id": "randomString5",
"title": "Something else",
"site": {
"_id": "rKFRbvH8CEbJYdzDs",
"title": "Title 1",
"addressLine1": "address1",
"UNID": "001",
"city": "cityName",
"createdAt": null
}
}
As you can see, the site data is identical in all of these documents, and the counter should report how many documents share that same site UNID.
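The expected result can be checked in plain JavaScript without a database. The following is only a sketch of the desired output shape: `countJobsBySite` is a hypothetical helper rather than a MongoDB API, the documents are trimmed to a few fields, and the job with UNID "002" is invented so that a second group appears.

```javascript
// Count jobs per site.UNID in memory (no database involved).
// `countJobsBySite` is a hypothetical helper, not part of MongoDB.
function countJobsBySite(jobs) {
  const bySite = new Map();
  for (const job of jobs) {
    const unid = job.site.UNID;
    const entry = bySite.get(unid);
    if (entry) {
      // Another job on a site we have already seen.
      entry.count += 1;
    } else {
      // Keep the first site sub-document seen and start its counter at 1.
      bySite.set(unid, { ...job.site, count: 1 });
    }
  }
  // Wrap each entry as { site: {...} } to match the desired result shape.
  return [...bySite.values()].map((site) => ({ site }));
}

const jobs = [
  { _id: 'randomString0', title: 'Quality', site: { UNID: '001', city: 'cityName' } },
  { _id: 'randomString1', title: 'Some2123', site: { UNID: '001', city: 'cityName' } },
  // Invented document, added only to show a second group.
  { _id: 'randomString2', title: 'Random title', site: { UNID: '002', city: 'otherCity' } },
];

const result = countJobsBySite(jobs);
console.log(JSON.stringify(result));
// [{"site":{"UNID":"001","city":"cityName","count":2}},{"site":{"UNID":"002","city":"otherCity","count":1}}]
```

Each distinct UNID yields one output document carrying the site data plus a count, which is the shape a `$group` stage would need to reproduce.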
Answer 0 (score: 1)
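In the $group stage, the _id expression (the key you want to group by) is required. Every other field in that stage must use an accumulator, and only a handful of accumulators are available in the $group aggregation stage.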
So your aggregation should look something like this:
[
{ "$match": { "$text": { "$search": body.searchTerms }}},
{ "$sort": { "score": { "$meta": "textScore" } } },
{ "$match": { "score": { "$gt": matchScore }}},
{ "$group": {
"_id": "$UPRN",
"myCount": { "$sum": 1 },
"score": { "$first": "$score" }
}},
{ "$limit": 20 },
{ "$project": {
"site": "$_id",
"score": 1,
"myCount": 1
}}
]
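A variant of this pipeline that matches the sample documents could look like the sketch below. It rests on several assumptions that are not in the original code: the grouping key is the nested path `site.UNID` (the pipeline above uses `$UPRN`), the text score is copied into a real `score` field with `$addFields` so that later stages can filter on it, and `buildSitePipeline` is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the pipeline for a search string and a
// score threshold. Field paths follow the sample documents (site.UNID).
function buildSitePipeline(searchTerms, matchScore) {
  return [
    // A $match containing $text has to be the first stage of the pipeline.
    { $match: { $text: { $search: searchTerms } } },
    // Assumption: materialize the text score as a field so that the
    // following stages can filter and sort on it.
    { $addFields: { score: { $meta: 'textScore' } } },
    { $match: { score: { $gt: matchScore } } },
    { $sort: { score: -1 } },
    {
      $group: {
        _id: '$site.UNID',           // grouping key (required)
        site: { $first: '$site' },   // site sub-document of the best match
        score: { $first: '$score' }, // best score, given the sort above
        count: { $sum: 1 },          // number of jobs sharing this UNID
      },
    },
    // $group does not order its output, so sort again before limiting.
    { $sort: { score: -1 } },
    { $limit: 20 },
    { $project: { _id: 0, site: 1, score: 1, count: 1 } },
  ];
}

const pipeline = buildSitePipeline('quality', 0.5);
console.log(pipeline.map((stage) => Object.keys(stage)[0]).join(' > '));
```

The array can be passed to `aggregate()` on the jobs collection. An accumulator such as `{ $sum: '$counter' }` reads `counter` from the incoming documents, not from a sibling accumulator in the same stage, so it adds up to 0 when the documents have no such field. A single `{ $sum: 1 }` is enough to count the jobs in each group.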