I am new to MongoDB and am having trouble getting unique sub-documents in an array.
The documents in my collection look like this:
{
"PubDate": "1/01/01 00:00",
"Title": "Identification of DNA-Dependent Protein Kinase Catalytic Subunit (DNA-PKcs) as a Novel Target of Bisphenol A",
"Datums": [
{
"evidence_id": "3515620_6",
"evidence": [
"\n\nTo examine the interaction between DNA-PKcs and Ku70/Ku80 more directly, we performed immunoprecipitation (IP) using FLAG-Ku70 or FLAG-Ku80 recombinants, which were expressed in 293T cells after IR-irradiation (Fig. 4B\n ) or UV-irradiation (Fig. 4C\n ). After IR-irradiation, co-precipitation of DNA-PKcs with Ku80 increased compared with that in the non-irradiated controls (Fig. 4B\n lanes 7 and 8)."
],
"map": {
"change": [
{
"Text": "increased"
}
],
"subject": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"treatment": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"assay": [
{
"Text": "copptby"
}
]
}
},
{
"evidence_id": "3515620_6",
"evidence": [
"\n\nTo examine the interaction between DNA-PKcs and Ku70/Ku80 more directly, we performed immunoprecipitation (IP) using FLAG-Ku70 or FLAG-Ku80 recombinants, which were expressed in 293T cells after IR-irradiation (Fig. 4B\n ) or UV-irradiation (Fig. 4C\n ). After IR-irradiation, co-precipitation of DNA-PKcs with Ku80 increased compared with that in the non-irradiated controls (Fig. 4B\n lanes 7 and 8)."
],
"map": {
"change": [
{
"Text": "increased"
}
],
"subject": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"treatment": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"assay": [
{
"Text": "copptby"
}
]
}
},
{
"evidence_id": "3515620_6",
"evidence": [
"\n\nTo examine the interaction between DNA-PKcs and Ku70/Ku80 more directly, we performed immunoprecipitation (IP) using FLAG-Ku70 or FLAG-Ku80 recombinants, which were expressed in 293T cells after IR-irradiation (Fig. 4B\n ) or UV-irradiation (Fig. 4C\n ). After IR-irradiation, co-precipitation of DNA-PKcs with Ku80 increased compared with that in the non-irradiated controls (Fig. 4B\n lanes 7 and 8)."
],
"map": {
"change": [
{
"Text": "increased"
}
],
"subject": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"treatment": [
{
"Entity": {
"strings": [
"dna-pkcs"
],
"uniprotSym": "P78527"
}
}
],
"assay": [
{
"Text": "copptby"
}
]
}
}
],
"Volume": "7",
"FullJournalName": "PLoS ONE",
"Authors": "Ito Y, Ito T, Karasawa S, Enomoto T, Nashimoto A, Hase Y, Sakamoto S, Mimori T, Matsumoto Y, Yamaguchi Y, Handa H",
"Issue": "12",
"Pages": "e50481",
"PMCID": "3515620"
}
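In the example above, the "Datums" field contains only one distinct sub-document (repeated three times), but typically the "Datums" field holds about 20-30 sub-documents. I want my MongoDB query to return the documents that satisfy certain conditions, with the "Datums" array of each document containing only unique sub-documents. To do this, I am using the following MongoDB query: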
db.My_Datums.aggregate(
[
{ "$match": {
"Datums":
{
"$elemMatch":
{
"map.treatment.Entity.uniprotSym": { "$in": ["P33981", "P78527"] },
"map.assay.Text": "copptby"
}
}
}},
{ "$project": { "PMCID":1, "Title":1, "PubDate":1, "Volume":1, "Issue":1, "Pages":1, "FullJournalName":1, "Authors":1, "Datums.map.assay.Text":1, "Datums.map.change.Text":1, "Datums.map.subject.Entity.strings":1, "Datums.map.treatment.Entity.uniprotSym":1, "Datums.evidence_id":1, "_id":0 }},
{ "$unwind": "$Datums" },
{ "$match": { "Datums.map.treatment.Entity.uniprotSym": { "$in": ["P33981", "P78527"] }, "Datums.map.assay.Text": "copptby" }},
{ "$group": { "_id": "$PMCID", "Datums": { "$addToSet": "$Datums" }}}
]
#{ allowDiskUse: 1 }
)
However, when I run the command above, I get the following output:
{u'Datums': [{u'evidence_id': u'3515620_6',
u'map': {u'assay': [{u'Text': u'copptby'}],
u'change': [{u'Text': u'increased'}],
u'subject': [{u'Entity': {u'strings': u'dna-pkcs'}}],
u'treatment': [{u'Entity': {u'uniprotSym': u'P78527'}}]}},
{u'evidence_id': u'3515620_6',
u'map': {u'assay': [{u'Text': u'copptby'}],
u'change': [{u'Text': u'increased'}],
u'subject': [{u'Entity': {u'strings': u'dna-pkcs'}}],
u'treatment': [{u'Entity': {u'uniprotSym': u'P78527'}}]}},
{u'evidence_id': u'3515620_6',
u'map': {u'assay': [{u'Text': u'copptby'}],
u'change': [{u'Text': u'increased'}],
u'subject': [{u'Entity': {u'strings': u'dna-pkcs'}}],
u'treatment': [{u'Entity': {u'uniprotSym': u'P78527'}}]}}],
u'_id': u'3515620'}
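To make the expected result concrete, here is a minimal client-side sketch in Python of the deduplication I am after. It is not the aggregation itself, only an illustration: the helper name unique_subdocs is made up for this post, and it assumes every value in a sub-document is JSON-serializable, as in the sample above (it would fail on BSON types such as ObjectId).

```python
import json


def unique_subdocs(subdocs):
    """Return the sub-documents with exact duplicates removed, first-seen order kept.

    Hypothetical helper for illustration; assumes JSON-serializable values.
    """
    seen = set()
    unique = []
    for sub in subdocs:
        # sort_keys=True makes the comparison key independent of field order
        key = json.dumps(sub, sort_keys=True)
        if key not in seen:
            seen.add(key)
            unique.append(sub)
    return unique


# One projected sub-document, shaped like the query output above
datum = {
    "evidence_id": "3515620_6",
    "map": {
        "assay": [{"Text": "copptby"}],
        "change": [{"Text": "increased"}],
        "subject": [{"Entity": {"strings": ["dna-pkcs"]}}],
        "treatment": [{"Entity": {"uniprotSym": "P78527"}}],
    },
}

# Three identical copies, as in my output
doc = {"_id": "3515620", "Datums": [dict(datum), dict(datum), dict(datum)]}
doc["Datums"] = unique_subdocs(doc["Datums"])
print(len(doc["Datums"]))
```

For the document above, this leaves a single entry in "Datums", which is what I expected the $group stage with $addToSet to produce on the server.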
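I don't understand why $addToSet adds duplicate sub-documents to "Datums". Is there a way to filter out the duplicates? What am I doing wrong in my query? I have searched and read a lot but could not find a solution. Could a MongoDB expert help a newbie? I would be forever grateful!
Thanks in advance!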