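I have a file with several columns,
including IP (column 0) and ID (column 2).
I want to compare these two columns and get, one entry per line, the unique IDs that correspond to each IP,
i.e. (IP: id1, id2, id3, ...).
I do get a result, but it is printed as one continuous sequence rather than line by line. Also, for some IPs I get the same values over and over, i.e. (IP: id1, id2, id3, id1, id2, id3, id1, id2, id3, ...).
Sample file: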
ip,date,id,num1,num2
1xx.100.119.114,2018-09-09T12:00:00+0900,05b8a163c1e482227fddcfcf2ef9d0f1,0,1
1xx.103.183.254,2018-08-20T02:44:00+0900,07ed70b09b70b13ba08236ccb88f44a8,1,0
1xx.107.19.222,2018-08-22T05:08:00+0900,d5d673f8ee6b3fe1b8f796e0abdad96d,1,0
1xx.109.69.211,2018-08-10T18:55:00+0900,6a751d871f54ab01a72ef3edd07cb8d4,1,0
1xx.111.249.236,2018-09-15T04:03:00+0900,a5ec6954f3a7bf8bbc9e2077c24075ca,0,0
1xx.116.117.193,2018-09-29T13:45:00+0900,aec55760152a450d9aac31046a5b0677,1,0
1xx.119.4.181,2018-08-22T04:38:00+0900,-,-,-
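For reference, the two columns of interest could be pulled out of a file shaped like this sample with the standard `csv` module. This is only a sketch: the helper name `read_ip_id_pairs` is hypothetical, `io.StringIO` stands in for the real file, and treating `-` as "no ID" is an assumption based on the last sample row.

```python
import csv
import io

# Two data rows and the "-" row copied from the sample; StringIO stands in for the real file.
SAMPLE = """ip,date,id,num1,num2
1xx.100.119.114,2018-09-09T12:00:00+0900,05b8a163c1e482227fddcfcf2ef9d0f1,0,1
1xx.103.183.254,2018-08-20T02:44:00+0900,07ed70b09b70b13ba08236ccb88f44a8,1,0
1xx.119.4.181,2018-08-22T04:38:00+0900,-,-,-
"""

def read_ip_id_pairs(handle):
    """Yield (ip, id) tuples, skipping the header and rows without an ID."""
    reader = csv.reader(handle, skipinitialspace=True)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or malformed lines
        ip, uid = row[0].strip(), row[2].strip()
        # Assumption: "-" marks a missing ID, as in the last sample row.
        if uid and uid != "-":
            yield ip, uid

pairs = list(read_ip_id_pairs(io.StringIO(SAMPLE)))
print(pairs)
```

With a real file, the same function would be called on the handle returned by `open(..., newline="")`.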
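I start with a for loop and split each line into a list of strings. I then compare the ID fields and, when they match, store the IP under that ID in a defaultdict.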
from collections import defaultdict

# making the list of lines from the text file
with open("clust1vsSoftbankUID.txt") as f1:
    f1_lines = f1.readlines()

# storing IP vs UID in dict format (key: ID, value: list of IPs)
d = defaultdict(list)
for ln1 in f1_lines:
    pieces1 = ln1.split(",")
    # inner loop: compare this line's ID with the ID of every line in the file
    for ln2 in f1_lines:
        pieces2 = ln2.split(",")
        if pieces2[2] == pieces1[2]:
            d[pieces2[2]].append(pieces2[0])
print(d)
Actual result:
'4a862bad794926595a85d3bde74f0de2': ['1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70', '1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70']
Expected result:
'4a862bad794926595a85d3bde74f0de2': ['1xx.87.43.12', '1xx.87.46.95', '1xx.87.48.107', '1xx.87.48.216', '1xx.87.50.70']
So there should be no duplicate IPs for a given key (ID).
It should actually look like (IP: id1, id2, id3, ...),
but what I get is (IP: id1, id2, id3, id1, id2, id3, id1, id2, id3, ...).
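The repetition appears to come from the nested loop: when k lines share an ID, the inner loop appends all k IPs once for each of those k lines, which matches the five-fold repeat in the actual result. A single pass with a per-ID set of already-seen IPs would avoid this. The sketch below is one possible approach, not the only one; `group_ips_by_id` is a hypothetical name, and the date fields in the demo lines are placeholders rather than real data.

```python
from collections import defaultdict

def group_ips_by_id(lines):
    """Map each ID (column 2) to its IPs (column 0), deduplicated, in first-seen order."""
    grouped = defaultdict(list)  # ID -> ordered list of unique IPs
    seen = defaultdict(set)      # ID -> set of IPs already recorded
    for line in lines:
        pieces = line.strip().split(",")
        if len(pieces) < 3:
            continue  # skip blank or malformed lines
        ip, uid = pieces[0].strip(), pieces[2].strip()
        if ip not in seen[uid]:
            seen[uid].add(ip)
            grouped[uid].append(ip)
    return dict(grouped)

# Demo lines: IPs and ID taken from the question, dates are placeholders.
demo_lines = [
    "1xx.87.43.12,2018-08-01T00:00:00+0900,4a862bad794926595a85d3bde74f0de2,1,0",
    "1xx.87.46.95,2018-08-01T00:00:00+0900,4a862bad794926595a85d3bde74f0de2,1,0",
    "1xx.87.43.12,2018-08-02T00:00:00+0900,4a862bad794926595a85d3bde74f0de2,0,0",
]

result = group_ips_by_id(demo_lines)
print(result)
```

Each line is visited once, so the work grows linearly with the file size instead of quadratically.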
Second, the values are not displayed one per line. I would like them displayed line by line, like this:
'4a862bad794926595a85d3bde74f0de2':
'1xx.87.43.12',
'1xx.87.46.95',
'1xx.87.48.107',
'1xx.87.48.216',
'1xx.87.50.70',
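One way to get that layout is to build the text by hand instead of printing the dict directly. This is a minimal sketch, assuming the dict maps each ID to a list of IPs; `format_grouped` is a hypothetical helper name.

```python
def format_grouped(grouped):
    """Render each key on its own line, followed by one quoted value per line."""
    out = []
    for key, values in grouped.items():
        out.append(f"'{key}':")
        for value in values:
            out.append(f"'{value}',")
    return "\n".join(out)

# Small dict using values from the question.
d = {"4a862bad794926595a85d3bde74f0de2": ["1xx.87.43.12", "1xx.87.46.95"]}
text = format_grouped(d)
print(text)
```

If the exact quoting does not matter, `json.dumps(d, indent=2)` from the standard library also prints one value per line.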