Question

I have NLP data in the following format.

{ 
    "_id" : ObjectId("5b2cb405281fb45becc0553e"), 
    "Text" : "I",
    "Corpus" : "I need a new car.",
    "Instance" : NumberInt(1847451544)
}
{ 
    "_id" : ObjectId("5b2cb405281fb45becc0553f"), 
    "Text" : "need",
    "Corpus" : "I need a new car.",
    "Instance" : NumberInt(1847451544)
}
{ 
    "_id" : ObjectId("5b2cb405281fb45becc05540"), 
    "Text" : "a", 
    "Corpus" : "I need a new car.",  
    "Instance" : NumberInt(1847451544)
}
{ 
    "_id" : ObjectId("5b2cb405281fb45becc05541"), 
    "Text" : "new", 
    "Corpus" : "I need a new car.", 
    "Instance" : NumberInt(1847451544)
}
{ 
    "_id" : ObjectId("5b2cb405281fb45becc05542"), 
    "Text" : "car", 
    "Corpus" : "I need a new car.", 
    "Instance" : NumberInt(1847451544)
}
{ 
    "_id" : ObjectId("5b2cb405281fb45becc05543"), 
    "Text" : ".", 
    "Corpus" : "I need a new car.", 
    "Instance" : NumberInt(1847451544)
}

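I want to run an aggregation in MongoDB to find the sentences that contain a specific word (e.g. "car"), or several words together in one sentence (e.g. "new" $and "car"). Using $match with $or, as in $or:[{Text:"new"},{Text:"car"}], works, but I have not been able to get any $and to work for this purpose.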

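I also tried a different approach: grouping the documents by their Instance number (each sentence is tokenized into separate pieces that all carry the same Instance number), using $group with the following: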

{
_id:{Instance:"$Instance"},
Item:{$push: {Text:"$Text", Corpus:"$Corpus"}}
}

and then, in the next stage of the pipeline, using $match with $elemMatch as follows:

Item:{$elemMatch:{$or:[{Text:"new"},{Text:"car"}]}}

Again, this works for $or, but I cannot get it to work with $and.

I would appreciate any help with how to construct the pipeline, especially for the case where the sentences are grouped by "Instance".

P.S. "Instance" is a large random number generated in Python, so it is unique for each sentence.
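
To make the goal concrete, here is a minimal sketch of the kind of pipeline being asked about, not a confirmed solution. It assumes that grouping by "Instance" and then applying the $all query operator to the collected tokens expresses the "every word must be present" condition. The helper name `buildPipeline`, the output field `Words`, and the collection name `tokens` are hypothetical and do not come from the original data.

```javascript
// Hypothetical helper: builds an aggregation pipeline that keeps only the
// sentences containing ALL of the given words.
function buildPipeline(requiredWords) {
  return [
    // Collect the distinct tokens of each sentence under its Instance number.
    {
      $group: {
        _id: "$Instance",
        Corpus: { $first: "$Corpus" },
        Words: { $addToSet: "$Text" }
      }
    },
    // $all matches arrays that contain every listed value.
    { $match: { Words: { $all: requiredWords } } }
  ];
}

const pipeline = buildPipeline(["new", "car"]);

// In the mongo shell (collection name `tokens` is an assumption):
// db.tokens.aggregate(pipeline)
```

With the $group shape shown earlier, where `Item` holds subdocuments, the same idea would presumably be written as one $elemMatch per required word inside $all, e.g. Item:{$all:[{$elemMatch:{Text:"new"}},{$elemMatch:{Text:"car"}}]}.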

Retrieving multiple attributes of a single row in MongoDB using $and

0 Answers: