Python resets my keys/values after a few iterations

Asked: 2016-01-31 06:53:59

Tags: python dictionary

I'm not sure whether this is a syntax error in my code or some odd Pythonic iteration behaviour. As part of a longer script, I supply an input file, "Input.txt". The code should:

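  1. Iterate over each line of the input file

  2. Use the third column to generate the "keys" of an initially empty dictionary

  3. Use the first column as the "value" for the corresponding key on each line
  4. If the key (i.e. the content of the third column) already exists in the dictionary, just append the value (column 1).
  5. The problem: for some reason, Python resets the values/keys after 5 iterations. To make things clearer and to try to track down the error, I print out what the code does as it runs.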

    Input file:

    MouseGene   m_gene_FC   MouseLncRNA m_lnc_FC    HumanGene   h_gene_FC   HumanLncRNA h_lnc_FC    #_genes_Tested
    Spata1  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 Gm20645 0.507222015 1109
    XX  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 Gm11216 0.031375848 1109
    YY  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 Gm26964 0.372023062 1109
    ZZ  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 1110019D14Rik   0.272607682 1109
    JJ  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 C430042M11Rik   0.062670386 1109
    Spata1  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 Gm13166 0.210586702 1109
    Spata1  0.472455825 Gm20645 0.507222015 Spata1  0.472455825 Gm26825 0.043691414 1109
    

    Code:

    mouse_dict = {}
    infile=open("Input.txt", "r")
    for line in infile.readlines()[1:]: #skips header
        cols = line.rstrip().split('\t')
        if cols[2] in mouse_dict and cols[0] not in mouse_dict[cols[2]]: #if key is there, but the value is not, then append it
            mouse_dict[cols[2]].append(cols[0])
            print "key:", cols[2], "is there but value", cols[0], "is not"
            print "Values for", cols[2], "are now:", mouse_dict[cols[2]]
        else:
            mouse_dict[cols[2]] = [cols[0]]
            print "key:", cols[2], "is not there and value", cols[0], "is added"
    
    print "My final dictionary items are:", mouse_dict.items()
    

    I end up with the following output on screen:

    key: Gm20645 is not there and value Spata1 is added
    key: Gm20645 is there but value XX is not
    Values for Gm20645 are now: ['Spata1', 'XX']
    key: Gm20645 is there but value YY is not
    Values for Gm20645 are now: ['Spata1', 'XX', 'YY']
    key: Gm20645 is there but value ZZ is not
    Values for Gm20645 are now: ['Spata1', 'XX', 'YY', 'ZZ']
    key: Gm20645 is there but value JJ is not
    Values for Gm20645 are now: ['Spata1', 'XX', 'YY', 'ZZ', 'JJ']
    key: Gm20645 is not there and value Spata1 is added
    key: Gm20645 is not there and value Spata1 is added
    My final dictionary items are: [('Gm20645', ['Spata1'])]
    

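    I expected the key Gm20645 to end up with ['Spata1', 'XX', 'YY', 'ZZ', 'JJ'] as its final value.

    As you can see, only "Spata1" is left after the iterations, and somewhere along the way the Gm20645 key is lost, as shown by the line: key: Gm20645 is not there and value Spata1 is added

    My original file contains > 1000 lines, so I initially thought this was a memory problem. However, even when I cut it down to the few lines above, I still get this error (as shown in the example above). I also wondered whether Python allows only a maximum number of values per key in a dictionary and then resets automatically, but I found no evidence that this is true. I have never encountered an error like this and cannot find a solution. Any help would be greatly appreciated.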

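2 Answers:

Answer 0 (score: 1)

The mistake is in your if condition. For a value to be appended to the list, the whole condition must be true, because it is an and operation. So when the not in test fails for Spata1 (it is already in the list), control goes to the else branch, which overwrites the existing list with a new one.

Try something like this.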

if cols[2] in mouse_dict:
    if cols[0] not in mouse_dict[cols[2]]:
        mouse_dict[cols[2]].append(cols[0])
        print "key:", cols[2], "is there but value", cols[0], "is not"
        print "Values for", cols[2], "are now:", mouse_dict[cols[2]]
else:
    mouse_dict[cols[2]] = [cols[0]]
    print "key:", cols[2], "is not there and value", cols[0], "is added"
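
To make the fall-through easier to see, here is a minimal sketch that compares the two condition shapes on three rows. It is written in Python 3 syntax (the code in this thread is Python 2). The helper names build_buggy and build_fixed and the reduced two-column rows are assumptions made for illustration and are not part of the original script.

```python
# Each row is (value, key), i.e. (column 1, column 3) of the input file.
rows = [("Spata1", "Gm20645"), ("XX", "Gm20645"), ("Spata1", "Gm20645")]


def build_buggy(rows):
    """Combined condition: a duplicate value falls through to the else branch."""
    d = {}
    for value, key in rows:
        if key in d and value not in d[key]:
            d[key].append(value)
        else:
            # Reached when the key is missing OR when the value is a duplicate,
            # so an existing list gets overwritten here.
            d[key] = [value]
    return d


def build_fixed(rows):
    """Nested condition: the else branch runs only when the key is missing."""
    d = {}
    for value, key in rows:
        if key in d:
            if value not in d[key]:
                d[key].append(value)
        else:
            d[key] = [value]
    return d


print(build_buggy(rows))  # {'Gm20645': ['Spata1']}
print(build_fixed(rows))  # {'Gm20645': ['Spata1', 'XX']}
```

The third row repeats Spata1, which is the same situation as lines 6 and 7 of the question's input file: the key exists, the value is already present, and the combined condition is therefore false.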

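Answer 1 (score: 1)

Indeed, this was a coding error on my end: when the key is in the dictionary and cols[0] is already in mouse_dict[cols[2]] (i.e. the value exists), it skips the if and goes to the "else" statement, which resets my dictionary entry to a new key and a new value, and carries on. To fix this, use the following code: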

mouse_dict = {}
infile=open("Input.txt", "r")
for line in infile.readlines()[1:]: #skips header
    cols = line.rstrip().split('\t')
    if cols[2] not in mouse_dict.keys(): #First it checks if key is NOT there. if it is not, it adds it. if the key is in the dictionary it goes to the elif
        mouse_dict[cols[2]] = [cols[0]]
        print "key:", cols[2], "is not there and value", cols[0], "is added"
    elif (cols[2] in mouse_dict.keys()) and (cols[0] not in mouse_dict[cols[2]]): #if key is there, but the value is not, then append it
        mouse_dict[cols[2]].append(cols[0])
        print "key:", cols[2], "is there but value", cols[0], "is not"
        print "Values for", cols[2], "are now:", mouse_dict[cols[2]]
print "My final dictionary items are:", mouse_dict.items()

I tested it on my large file and it works. If anyone has any other suggestions, please let me know. A thumbs up would be well deserved :D.
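
As one further suggestion, the key check can be folded into dict.setdefault, which returns the existing list for a key or stores and returns a new empty one. The following is only a sketch in Python 3 syntax, under the assumption that the input is tab-separated as in the question; the function name group_genes and the shortened sample lines are hypothetical.

```python
def group_genes(lines):
    """Map column 3 (key) to the unique column-1 values, in first-seen order."""
    groups = {}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 3:
            continue  # skip blank or malformed lines
        # setdefault never overwrites a list that is already stored for the key
        values = groups.setdefault(cols[2], [])
        if cols[0] not in values:
            values.append(cols[0])
    return groups


# Shortened, made-up sample lines with only three tab-separated columns.
sample = [
    "Spata1\t0.47\tGm20645",
    "XX\t0.47\tGm20645",
    "Spata1\t0.47\tGm20645",
]
result = group_genes(sample)
print(result)  # {'Gm20645': ['Spata1', 'XX']}
```

With a real file, the function could be fed an open file object after skipping the header line, for example by calling next(infile) inside a with open("Input.txt") as infile: block and then passing infile to group_genes.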