Here is the problem I'm facing. I need to run a text search over a hierarchy of documents like the following:
// simplified mongoose schema definitions
const opp = new Schema({
  activity: { type: ObjectId, ref: 'Activity' },
  team: { type: ObjectId, ref: 'Team' },
  contribs: [{ type: ObjectId, ref: 'Contribution' }]
});
const activitySchema = new Schema({
  name: { type: 'String', required: true },
  language: { type: 'String', required: true, default: 'en' },
  location: { type: 'String', required: true },
  org: { type: ObjectId, ref: 'Org' },
  description: { type: 'String', required: true },
  translation: [translationSchema]
});
const orgSchema = new Schema({
  name: { type: 'String', required: true },
  language: { type: 'String', required: true, default: 'en' },
  description: { type: 'String', required: true },
  translation: [translationSchema]
});
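Given that, I'd like to run a full-text search that retrieves every opportunity whose activity, or that activity's org, contains the search terms.
First, I created text indexes so that Mongo can run the full-text search: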
// schema indexes
orgSchema.index({ name: 'text', description: 'text', 'translation.name': 'text', 'translation.description': 'text' }, { name: 'fulltext-search' });
activitySchema.index({ name: 'text', location: 'text', description: 'text', 'translation.name': 'text', 'translation.location': 'text', 'translation.description': 'text' }, { name: 'fulltext-search' });
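Here comes the trouble... Mongo's aggregation can't run the text search across referenced documents, so I have to aggregate the text search results in each collection independently.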
// Search text on Orgs
const scoredOrgs = await Org.aggregate([
  { $match: { $text: { $search: searchTerms } } },
  { $project: { _id: 1, score: { $meta: 'textScore' } } }
])
  .exec()
  .then(scores => {
    // Map each org ID to its text score
    const scoreMap = {};
    scores.forEach(s => { scoreMap[s._id] = s.score; });
    return scoreMap;
  });
// return example
{
  '58bd295f2464490015b8ae31': 0.9119718309859155,
  '58bd85a42464490015b8ae9e': 1.8068181818181817
}
// Search text on Activities
// Note: $text inside $or requires every clause of the $or to be backed by an index
const scoredActivities = await Activity.aggregate([
  { $match:
    { $or: [
      { $text: { $search: searchTerms } },
      { org: { $in: Object.keys(scoredOrgs).map(id => new ObjectId(id)) } }
    ]}
  },
  { $project: { _id: 1, org: 1, score: { $meta: 'textScore' } } }
])
  .exec()
  .then(scores => {
    // Map each activity ID to its own text score plus the score of its org, if any
    const scoreMap = {};
    scores.forEach(s => { scoreMap[s._id] = s.score + (scoredOrgs[s.org] || 0); });
    return scoreMap;
  });
// return example
{
  '58bd50912464490015b8ae43': 1.5369718309859155,
  '58bd86ae2464490015b8aea0': 2.4068181818181817,
  '58bd4f642464490015b8ae3f': 0.6,
  '58bd2a3c2464490015b8ae33': 0.9119718309859155,
  '58bd2b0e2464490015b8ae35': 0.9119718309859155
}
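I admit this is getting messy, but at this point I have an unsorted map from activity ID to the total score of the search over the activity's own fields and the fields of its underlying document (the org).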
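To make the combining step concrete, here is a minimal sketch in plain JavaScript with no database involved. The helper names `toScoreMap` and `addOrgScores`, as well as the sample IDs and scores, are made up for illustration; the rows only mimic the `{ _id, org, score }` shape that the `$project` stages return.

```javascript
// Hypothetical helpers; rows mimic the { _id, org, score } shape produced by $project.

// Turn [{ _id, score }] into { id: score }.
function toScoreMap(rows) {
  const map = {};
  for (const row of rows) {
    map[String(row._id)] = row.score;
  }
  return map;
}

// Add each activity's own text score to the score of its org (0 if the org did not match).
function addOrgScores(activityRows, orgScores) {
  const map = {};
  for (const row of activityRows) {
    map[String(row._id)] = row.score + (orgScores[String(row.org)] || 0);
  }
  return map;
}

// Made-up sample data.
const orgScores = toScoreMap([
  { _id: 'org1', score: 1.5 },
  { _id: 'org2', score: 0.25 }
]);
const activityScores = addOrgScores([
  { _id: 'act1', org: 'org1', score: 0.5 },  // matched by text and through its org
  { _id: 'act2', org: 'org3', score: 0.75 }, // matched by text only
  { _id: 'act3', org: 'org2', score: 0 }     // matched only through its org
], orgScores);

console.log(activityScores); // { act1: 2, act2: 0.75, act3: 0.25 }
```

Keeping the merge in small pure functions like these makes it possible to check the scoring logic separately from the queries.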
Finally, I can find the Opportunity documents whose activity is among the matched activity IDs, and then sort them by the computed score.
// Final Opps query, population and sorting
const opps = await Opportunity.find({
  team: new ObjectId('58bd76792464490015b8ae74'),
  activity: { $in: Object.keys(scoredActivities).map(id => new ObjectId(id)) }
})
  .populate('team')
  .deepPopulate('activity.org')
  .exec()
  .then(opps => {
    // Attach the computed score to each opportunity
    opps.forEach(opp => Object.assign(opp, { score: scoredActivities[opp.activity._id] }));
    // Sort by score, highest first; the comparator must return a number, not a boolean
    return opps.sort((x, y) => y.score - x.score);
  });
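The ranking step can also be tried in isolation. `Array.prototype.sort` expects a comparator that returns a negative, zero, or positive number, so a boolean comparison is not a reliable way to order the results. Below is a small sketch on plain objects; `rankByScore` is a hypothetical helper and the sample data is made up.

```javascript
// Hypothetical helper: attach a score to each item and sort by it, highest first.
function rankByScore(items, scores, getId) {
  return items
    .map(item => ({ ...item, score: scores[getId(item)] || 0 }))
    .sort((a, b) => b.score - a.score);
}

// Made-up sample data shaped like populated opportunities.
const sampleOpps = [
  { _id: 'opp1', activity: { _id: 'act1' } },
  { _id: 'opp2', activity: { _id: 'act2' } },
  { _id: 'opp3', activity: { _id: 'act3' } }
];
const sampleScores = { act1: 0.5, act2: 2, act3: 1.25 };

const ranked = rankByScore(sampleOpps, sampleScores, opp => String(opp.activity._id));
console.log(ranked.map(opp => opp._id)); // [ 'opp2', 'opp3', 'opp1' ]
```

The sketch copies each item with object spread, which assumes plain objects; real Mongoose documents would probably need `toObject()` first.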
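I find this inefficient because it requires several separate aggregations. Is there a better way to achieve this?
Do I need to process the data outside the aggregation framework, or can it all be done within the pipeline?
I really believe this search case is quite common, though. Thanks a lot for your help.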