How to store multiple unique values for a single key in a dictionary

Time: 2019-05-14 05:50:23

Tags: python-3.x dictionary

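I have a file with several columns, including the IP (column 0) and the ID (column 2).

I want to compare these two columns and get, for each IP, its unique IDs, one per line, i.e. (IP: id1, id2, id3, ...).

I do get a result, but it is printed sequentially rather than line by line. Also, for some IPs I get the same values over and over, i.e. (IP: id1, id2, id3, id1, id2, id3, id1, id2, id3, ...).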

Sample file:

    ip, date,id,num1,num2
    1xx.100.119.114,2018-09-09T12:00:00+0900,05b8a163c1e482227fddcfcf2ef9d0f1,0,1
    1xx.103.183.254,2018-08-20T02:44:00+0900,07ed70b09b70b13ba08236ccb88f44a8,1,0
    1xx.107.19.222,2018-08-22T05:08:00+0900,d5d673f8ee6b3fe1b8f796e0abdad96d,1,0
    1xx.109.69.211,2018-08-10T18:55:00+0900,6a751d871f54ab01a72ef3edd07cb8d4,1,0
    1xx.111.249.236,2018-09-15T04:03:00+0900,a5ec6954f3a7bf8bbc9e2077c24075ca,0,0
    1xx.116.117.193,2018-09-29T13:45:00+0900,aec55760152a450d9aac31046a5b0677,1,0
    1xx.119.4.181,2018-08-22T04:38:00+0900,-,-,-
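
For reference, a file laid out like this can be read with the standard `csv` module instead of a manual `split(",")`. The sketch below is only an illustration: the helper name `read_ip_id_pairs` is hypothetical, and it assumes that the first row is a header and that an ID of `-` means no ID was recorded and the row can be skipped.

```python
import csv
import io


def read_ip_id_pairs(lines):
    """Yield (ip, id) tuples from CSV lines laid out as ip,date,id,num1,num2.

    Hypothetical helper: assumes the first row is a header and that an ID
    of "-" is a placeholder for a missing ID.
    """
    reader = csv.reader(lines, skipinitialspace=True)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or truncated lines
        ip, uid = row[0].strip(), row[2].strip()
        if uid != "-":
            yield ip, uid


# Two rows copied from the sample file, fed in through an in-memory buffer.
sample = io.StringIO(
    "ip, date,id,num1,num2\n"
    "1xx.107.19.222,2018-08-22T05:08:00+0900,d5d673f8ee6b3fe1b8f796e0abdad96d,1,0\n"
    "1xx.119.4.181,2018-08-22T04:38:00+0900,-,-,-\n"
)
pairs = list(read_ip_id_pairs(sample))
print(pairs)
```

The same function accepts an open file object, since `csv.reader` works on any iterable of lines.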

I start with a for loop and split each line into fields. I then compare the IDs and, where they match, store the IP under that ID in a defaultdict.

    import json
    import re
    from collections import defaultdict
    from collections import Counter
    import collections

    # making the list of lines from the text file
    f1_lines=[]
    with open("clust1vsSoftbankUID.txt") as f1:
        f1_lines=f1.readlines()

    # storing IP vs UID in dict format

    d = defaultdict(list)      
    for ln1 in f1_lines:
        pieces1=ln1.split(",")
        for ln2 in f1_lines:
                pieces2=ln2.split(",")
                if pieces2[2] == pieces1[2]:
                        d[pieces2[2]].append(pieces2[0])
    print(d)                        

The actual result is:

    '4a862bad794926595a85d3bde74f0de2': ['1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70']

Expected result:

    '4a862bad794926595a85d3bde74f0de2': ['1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70']

So there should be no duplicate IPs for a given ID key.

It should really look like (IP: id1, id2, id3, ...).

But what I get is (IP: id1, id2, id3, id1, id2, id3, id1, id2, id3, ...).
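
One likely cause of the repetition is the nested loop: every line is compared with every other line, so an ID that appears on n lines has each of its IPs appended n times. A single pass that collects values into a set-like container avoids this. The following is a minimal sketch, not a definitive fix: `group_unique` is a hypothetical name, the input pairs are made up, and it relies on dictionaries keeping insertion order (guaranteed from Python 3.7).

```python
from collections import defaultdict


def group_unique(pairs):
    """Map each key to its distinct values, keeping first-seen order."""
    grouped = defaultdict(dict)  # inner dict acts as an ordered set
    for key, value in pairs:
        grouped[key][value] = None  # re-adding an existing value is a no-op
    return {key: list(values) for key, values in grouped.items()}


# Made-up (id, ip) pairs; the repeats mimic an ID seen on several lines.
pairs = [
    ("id_a", "1xx.87.43.12"),
    ("id_a", "1xx.87.46.95"),
    ("id_a", "1xx.87.43.12"),
    ("id_b", "1xx.87.50.70"),
    ("id_a", "1xx.87.46.95"),
]
result = group_unique(pairs)
print(result)
# {'id_a': ['1xx.87.43.12', '1xx.87.46.95'], 'id_b': ['1xx.87.50.70']}
```

Passing `(ip, id)` pairs instead of `(id, ip)` gives the IP-keyed form (IP: id1, id2, id3, ...). If the order of the values does not matter, `defaultdict(set)` with `.add()` is a simpler alternative.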

Second, the values are not displayed line by line. I would like them displayed one per line, like this:

    '4a862bad794926595a85d3bde74f0de2':
    '1xx.87.43.12',
    '1xx.87.46.95',
    '1xx.87.48.107',
    '1xx.87.48.216',
    '1xx.87.50.70',
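
A layout like the one above can be produced by looping over the dictionary rather than printing it in one call. This is a small sketch under the assumption that the data is a plain dict of lists; `format_grouped` is a hypothetical helper name.

```python
def format_grouped(grouped):
    """Render each key on its own line, followed by one value per line."""
    lines = []
    for key, values in grouped.items():
        lines.append(f"{key!r}:")
        lines.extend(f"{value!r}," for value in values)
    return "\n".join(lines)


grouped = {
    "4a862bad794926595a85d3bde74f0de2": ["1xx.87.43.12", "1xx.87.46.95"],
}
text = format_grouped(grouped)
print(text)
# '4a862bad794926595a85d3bde74f0de2':
# '1xx.87.43.12',
# '1xx.87.46.95',
```

If JSON-style output is acceptable, `json.dumps(grouped, indent=2)` from the standard library is another option.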

0 answers:

No answers