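I am comparing dictionaries of server file listings that contain up to 100,000 entries.
I have captured a file listing from each of my servers and read them into dictionaries in my program. The keys are md5 hashes and the values are paths (e.g. /usr/john/upstart.exe).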
My dictionaries are named firstServ and secondServ.
I need to find out how the two dictionaries differ.
Basically, I just need to know how to do these comparisons. Thanks for any input.
Answer 0 (score: 0)
There are 101 ways to do this.
Maybe load your key/value pairs into something like this:
x = { k: [v1,v2] }
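That gives you the data from firstServ and secondServ grouped by hash. Then just loop over the dictionary and find the entries where the two values differ.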
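As a rough sketch of that idea (the helper name group_by_hash and the sample data are made up for illustration, and a hash missing on one server is assumed to be recorded as None):

```python
# Sample data; the hashes and paths are placeholders, not real listings.
firstServ = {"md5Hash1": "path1", "md5Hash2": "path2"}
secondServ = {"md5Hash1": "path4", "md5Hash4": "path2"}


def group_by_hash(first, second):
    """Map each hash to [path on first server, path on second server].

    A hash that is missing on one side gets None in that slot.
    """
    grouped = {}
    # keys() views are set-like, so "|" gives every hash seen on either server
    for key in first.keys() | second.keys():
        grouped[key] = [first.get(key), second.get(key)]
    return grouped


grouped = group_by_hash(firstServ, secondServ)

# Keep only the hashes whose two slots differ (this includes "missing on one side")
differing = {k: v for k, v in grouped.items() if v[0] != v[1]}
print(differing)
```

Each pass over the data is a single loop, so this should stay manageable at 100,000 entries per server.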
Answer 1 (score: 0)
Using set operations and list comprehensions, you can do this:
Given these dictionaries:
firstServ = {"md5Hash1":"path1", "md5Hash2":"path2", "md5Hash3":"path3"}
secondServ = {"md5Hash1":"path4", "md5Hash4":"path2", "md5Hash5":"path5"}
Extract the keys:
firstServKeys = set(firstServ.keys())
secondServKeys = set(secondServ.keys())
Keys unique to firstServ
Subtract secondServKeys from firstServKeys:
uniqueKeysInFirstServ = firstServKeys.difference(secondServKeys)
Keys unique to secondServ
Subtract firstServKeys from secondServKeys:
uniqueKeysInSecondServ = secondServKeys.difference(firstServKeys)
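The same two differences can also be taken directly on the dictionaries' key views, which are set-like, without building the intermediate sets. This is only a sketch; the names only_in_first, only_in_second and the paths_only_in_* lists are made up here:

```python
firstServ = {"md5Hash1": "path1", "md5Hash2": "path2", "md5Hash3": "path3"}
secondServ = {"md5Hash1": "path4", "md5Hash4": "path2", "md5Hash5": "path5"}

# dict.keys() returns a set-like view, so "-" works on it directly
only_in_first = firstServ.keys() - secondServ.keys()
only_in_second = secondServ.keys() - firstServ.keys()

# Look the paths back up so a report can show file names, not just hashes
paths_only_in_first = sorted(firstServ[k] for k in only_in_first)
paths_only_in_second = sorted(secondServ[k] for k in only_in_second)

print(paths_only_in_first)
print(paths_only_in_second)
```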
Values present in both dictionaries under different keys
Invert both mappings from {hash: path} to {path: hash}. Then, for every path in inv_firstServ, check whether inv_secondServ also contains it, and keep the paths whose hashes differ.
inv_firstServ = {v: k for k, v in firstServ.items()}  # Inverting keys and values
inv_secondServ = {v: k for k, v in secondServ.items()}  # Inverting keys and values
valuesWithDifferentkeys = [v for v in inv_firstServ
                           if v in inv_secondServ and inv_secondServ[v] != inv_firstServ[v]]
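One caveat with inverting by dict comprehension: if a path were ever listed under two hashes in the same dictionary, the later entry would silently overwrite the earlier one. A minimal sketch of a stricter inversion, assuming you would rather fail loudly in that case (the helper name invert and the result name changed are made up here):

```python
firstServ = {"md5Hash1": "path1", "md5Hash2": "path2", "md5Hash3": "path3"}
secondServ = {"md5Hash1": "path4", "md5Hash4": "path2", "md5Hash5": "path5"}


def invert(hash_to_path):
    """Return {path: hash}, refusing to silently drop a duplicate path."""
    path_to_hash = {}
    for file_hash, path in hash_to_path.items():
        if path in path_to_hash:
            raise ValueError("path listed under two hashes: {}".format(path))
        path_to_hash[path] = file_hash
    return path_to_hash


first_by_path = invert(firstServ)
second_by_path = invert(secondServ)

# path -> (hash on first server, hash on second server), only where they differ
changed = {
    path: (first_by_path[path], second_by_path[path])
    for path in first_by_path.keys() & second_by_path.keys()
    if first_by_path[path] != second_by_path[path]
}
print(changed)
```

Keeping both hashes in the result makes it easier to see which version of the file sits on which server.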
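Keys present in both dictionaries with different values
Intersect firstServKeys and secondServKeys to get the keys common to both, so that we work on a smaller set. Then, for each common key, keep it if the two values differ.
keysWithDifferentValues = [k for k in firstServKeys.intersection(secondServKeys)
                           if firstServ[k] != secondServ[k]]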
Print the results:
print("Keys unique to first server:")
print(uniqueKeysInFirstServ)
print("Keys unique to second server:")
print(uniqueKeysInSecondServ)
print("Values present in both servers but with a different key:")
print(valuesWithDifferentkeys)
print("Keys present in both servers but with a different value:")
print(keysWithDifferentValues)
Output
Keys unique to first server:
{'md5Hash2', 'md5Hash3'}
Keys unique to second server:
{'md5Hash5', 'md5Hash4'}
Values present in both servers but with a different key:
['path2']
Keys present in both servers but with a different value:
['md5Hash1']
Answer 2 (score: 0)
A very simple idea is as follows:
FirstSet = {"1": "C:/", "2": "C:/Windows", "3": "C:/Users", "4": "C:/Something"}
SecondSet = {"10": "E:/", "20": "C:/", "30": "C:/Users"}
Differences = []
# Collect the paths that appear in only one of the two dictionaries
for i in FirstSet.keys():
    if FirstSet[i] not in SecondSet.values():
        Differences.append((FirstSet[i], "FirstSet"))
for i in SecondSet.keys():
    if SecondSet[i] not in FirstSet.values():
        Differences.append((SecondSet[i], "SecondSet"))
for i in Differences:
    print("Only set {} has the {} element.".format(i[1], i[0]))
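Note that a membership test against .values() scans the values one by one, so with 100,000 entries per dictionary the loops above can get slow. A possible variant, sketched here with made-up names (first_paths, second_paths, only_first, only_second), builds a set of paths once per dictionary and uses set difference instead:

```python
FirstSet = {"1": "C:/", "2": "C:/Windows", "3": "C:/Users", "4": "C:/Something"}
SecondSet = {"10": "E:/", "20": "C:/", "30": "C:/Users"}

# Build each set of paths once; membership tests on a set are O(1) on average
first_paths = set(FirstSet.values())
second_paths = set(SecondSet.values())

# Sorted only so the report comes out in a stable order
only_first = sorted(first_paths - second_paths)
only_second = sorted(second_paths - first_paths)

for path in only_first:
    print("Only set FirstSet has the {} element.".format(path))
for path in only_second:
    print("Only set SecondSet has the {} element.".format(path))
```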