How should I store data with duplicate keys in a MongoDB document?

Asked: 2013-09-10 03:11:57

Tags: mongodb document duplicates

I ran into a problem while storing some data in MongoDB. For simplicity, the structure looks like this:

FEATURES             Location/Qualifiers
     source          1..4242774
                     /organism="Bacillus amyloliquefaciens subsp. plantarum YAU
                     B9601-Y2"
                     /mol_type="genomic DNA"
                     /strain="YAU B9601-Y2"
                     /sub_species="plantarum"
                     /db_xref="taxon:1155777"
     gene            412..1752
                     /gene="dnaA"
                     /locus_tag="BANAU_0001"
     CDS             412..1752
                     /gene="dnaA"
                     /locus_tag="BANAU_0001"
                     /function="ATPase involved in DNA replication initiation"
                     /codon_start=1
                     /transl_table=11
                     /product="Chromosomal replication initiator protein dnaA"
                     /protein_id="CCG48023.1"
                     /db_xref="GI:380496985"
                     /db_xref="GOA:H8XCH4"
                     /db_xref="UniProtKB/TrEMBL:H8XCH4"
                     /translation="MENILDLWNQALAQIEKKLSKPSFETWMKSTKAHSLQGDTLTIT
                     APNEFARDWLESRYLHLIADTIYELTGEELSVKFVIPQNQDEEDFLPKPQVKKAAKEE
                     PSDFPQSMLNPKYTFDTFVIGSGNRFAHAASLAVAEAPAKAYNPLFIYGGVGLGKTHL
                     MHAIGHYVIDHNPSAKVVYLSSEKFTNEFINSIRDNKAVDFRNRYRNVDVLLIDDIQF
                     LAGKEQTQEEFFHTFNTLHEESKQIVISSDRPPKEIPTLEDRLRSRFEWGLITDITPP
                     DLETRIAILRKKAKAEGLDIPNEVMLYIANQIDSNIRELEGALIRVVAYSSLINKDIN
                     ADLAAEALKDIIPSSKPKVITIKEIQRIVGQQFNIKLEDFKAKKRTKSVAFPRQIAMY
                     LSREMTDSSLPKIGEEFGGRDHTTVIHAHEKISKLLIDDEQLQQQVKEIKELLK"
     gene            1937..3073
                     /gene="dnaN"
                     /locus_tag="BANAU_0002"
     CDS             1937..3073
                     /gene="dnaN"

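As the excerpt shows, the keys "gene" and "CDS" are repeated many times, and as far as I know MongoDB does not allow duplicate keys within a document. So my question is: how should I organize this data structure in MongoDB?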

1 answer:

Answer 0: (score: 0)

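Instead of modeling the repeated fields at the same level of the document, you can use an array of subdocuments for the repeating elements (such as genes), for example: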

{
    source: "1..4242774",
    organism: "Bacillus amyloliquefaciens subsp. plantarum YAU B9601-Y2",
    mol_type: "genomic DNA",
    strain: "YAU B9601-Y2",
    sub_species: "plantarum",
    db_xref: "taxon:1155777",
    genes: [
        {
            id: '412..1752',
            gene: "dnaA",
        },
        {
            id: '1937..3073',
            gene: "dnaN",
        }
    ],
    CDS: [
        {
            id: '412..1752',
        }
    ]
}
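
As a rough illustration of how such a document might be assembled in application code, here is a minimal Python sketch. It assumes the feature table has already been parsed into (feature type, location, qualifiers) tuples. The `build_document` helper and the tuple layout are hypothetical, not part of any library, and unlike the example above it simply uses each feature type as the array key (so "source" also becomes a one-element array). The result is a plain dict that a driver could then insert.

```python
from collections import defaultdict

# Hypothetical pre-parsed features: (feature_type, location, qualifiers).
features = [
    ("source", "1..4242774", {"strain": "YAU B9601-Y2", "mol_type": "genomic DNA"}),
    ("gene", "412..1752", {"gene": "dnaA", "locus_tag": "BANAU_0001"}),
    ("CDS", "412..1752", {"gene": "dnaA", "locus_tag": "BANAU_0001"}),
    ("gene", "1937..3073", {"gene": "dnaN", "locus_tag": "BANAU_0002"}),
    ("CDS", "1937..3073", {"gene": "dnaN"}),
]


def build_document(features):
    """Group repeated feature types into arrays of subdocuments."""
    grouped = defaultdict(list)
    for feature_type, location, qualifiers in features:
        # Each occurrence becomes one subdocument; the location is stored
        # under "id", as in the example document above.
        grouped[feature_type].append({"id": location, **qualifiers})
    return dict(grouped)


doc = build_document(features)
print(doc["gene"])
```

Because every repeated feature lands in a list, the document never needs two keys with the same name at one level.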

For details, see Array of Subdocuments in the MongoDB manual.
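
The same idea may also help inside a single feature: in the question's data, the CDS entry carries /db_xref three times. A minimal Python sketch, assuming the qualifiers arrive as (key, value) pairs, could store such keys as an array of values. The `collect_qualifiers` helper and its `multi_valued` parameter are hypothetical names chosen for illustration.

```python
# Hypothetical (key, value) pairs for one CDS feature, taken from the question's data.
pairs = [
    ("gene", "dnaA"),
    ("protein_id", "CCG48023.1"),
    ("db_xref", "GI:380496985"),
    ("db_xref", "GOA:H8XCH4"),
    ("db_xref", "UniProtKB/TrEMBL:H8XCH4"),
]


def collect_qualifiers(pairs, multi_valued=("db_xref",)):
    """Build a subdocument; keys listed in multi_valued always hold a list."""
    subdoc = {}
    for key, value in pairs:
        if key in multi_valued:
            # Append to the list, creating it on first sight of the key.
            subdoc.setdefault(key, []).append(value)
        else:
            subdoc[key] = value
    return subdoc


cds = collect_qualifiers(pairs)
print(cds["db_xref"])
```

Keeping such a field a list even when only one value is present means it has the same type in every subdocument, which tends to make later queries simpler.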