I have two dictionaries:
d1 = {'100.1125': '353.2216', '151.0977': '131.2193', '102.0553': '103.6859', '103.0209': '104.624'}
d2 = {'100.1124': '352.2220', '200': '131.2193', '300': '103.6859', '400': '104.624', '103.0545': '448.3161'}
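I want to loop over the keys in d1 and check whether they exist in d2, within a specified +/- tolerance. If a key matches, I want to compare the values associated with the two keys and check that they also match within a given tolerance. If a match is found, I want to write the output to a file (Output_match.txt). If no match is found, I want to write the d1 key and its associated value to a second file (Output_nomatch.txt).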
So, let's say the tolerance for the key comparison is +/- 0.0002 and the tolerance for the value comparison is +/- 5.
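To make the criterion concrete, here is a minimal sketch of the check described above, assuming the keys and values are numeric strings. The helper name `within` is hypothetical and used only for illustration:

```python
def within(a, b, tolerance):
    """Return True when the numeric strings a and b differ by at most tolerance."""
    return abs(float(a) - float(b)) <= tolerance


# Keys '100.1125' (d1) and '100.1124' (d2) differ by about 0.0001
key_ok = within('100.1125', '100.1124', 0.0002)
# Their values differ by about 0.9996
value_ok = within('353.2216', '352.2220', 5)
# Keys '103.0209' (d1) and '103.0545' (d2) differ by about 0.0336
other_key_ok = within('103.0209', '103.0545', 0.0002)

print(key_ok, value_ok, other_key_ok)  # True True False
```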
I want Output_match.txt to contain:
key ---- value
100.1125 ---- 353.2216
I want Output_nomatch.txt to contain:
key ---- value
151.0977 ---- 131.2193
102.0553 ---- 103.6859
103.0209 ---- 104.624
Can anyone offer any help here?
Edit:
Sorry for not providing my current attempt:
import os

# path (output directory), ppm (m/z tolerance in ppm) and rt_shift
# (retention-time tolerance) are assumed to be defined elsewhere.
a, b = {}, {}  # entries of d1 / d2 that took part in at least one match

with open(os.path.join(path, 'out_true.txt'), 'w') as opt_true, \
        open(os.path.join(path, 'out_false.txt'), 'w') as opt_false:
    header = '%s\t%s\t%s\t%s\n' % ('file1_mz', 'file1_rt', 'file2_mz', 'file2_rt')
    opt_true.write(header)
    for key in d1:
        # Tolerance window around the d1 key (ppm) and its value (rt_shift)
        upper_mz = float(key) + float(key) * (ppm * 0.000001)
        lower_mz = float(key) - float(key) * (ppm * 0.000001)
        upper_rt = float(d1[key]) + 2 * rt_shift
        lower_rt = float(d1[key]) - 2 * rt_shift
        for key2 in d2:
            # Same windows around the d2 key and its value
            upper_mz2 = float(key2) + float(key2) * (ppm * 0.000001)
            lower_mz2 = float(key2) - float(key2) * (ppm * 0.000001)
            upper_rt2 = float(d2[key2]) + 2 * rt_shift
            lower_rt2 = float(d2[key2]) - 2 * rt_shift
            # Two entries match when both pairs of windows overlap
            if (upper_mz >= lower_mz2 and lower_mz <= upper_mz2 and
                    upper_rt >= lower_rt2 and lower_rt <= upper_rt2):
                opt_true.write('%s\t%s\t%s\t%s\n' % (key, d1[key], key2, d2[key2]))
                a[key] = d1[key]
                b[key2] = d2[key2]
    # Write every entry that never matched; the with block closes both files
    for key in d1:
        if key not in a:
            opt_false.write('%s\t%s\n' % (key, d1[key]))
    for key2 in d2:
        if key2 not in b:
            opt_false.write('%s\t%s\n' % (key2, d2[key2]))
Answer 0 (score: 1)
You can try something like this, using the built-in function abs to compute the distance in a simple way:
d1 = {'100.1125': '353.2216', '151.0977': '131.2193',
      '102.0553': '103.6859', '103.0209': '104.624'}
d2 = {'100.1124': '352.2220', '200': '131.2193',
      '300': '103.6859', '400': '104.624', '103.0545': '448.3161'}
key_tolerance, value_tolerance = 0.0002, 5
output_match, output_nomatch = [], []
for i, j in d1.items():
    for k, l in d2.items():
        if (abs(float(i)-float(k)) <= key_tolerance and
                abs(float(j)-float(l)) <= value_tolerance):
            output_match.append((i, j))
        else:
            output_nomatch.append((i, j))
print(output_match, '----', set(output_nomatch) - set(output_match), sep='\n')
Output:
[('100.1125', '353.2216')]
----
{('102.0553', '103.6859'),
('103.0209', '104.624'),
('151.0977', '131.2193')}
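The snippet above only prints the two groups. As a minimal sketch of the remaining step, the same abs-based check could be combined with file output roughly as follows. The helper names `split_matches` and `write_pairs` are hypothetical, and the sketch assumes Python 3.7+ (so dictionaries keep insertion order) and that the files should be written to the current directory in the `key ---- value` layout from the question:

```python
d1 = {'100.1125': '353.2216', '151.0977': '131.2193',
      '102.0553': '103.6859', '103.0209': '104.624'}
d2 = {'100.1124': '352.2220', '200': '131.2193',
      '300': '103.6859', '400': '104.624', '103.0545': '448.3161'}
key_tolerance, value_tolerance = 0.0002, 5


def split_matches(d1, d2, key_tol, value_tol):
    """Split d1's (key, value) pairs into those with and without a match in d2."""
    matches, nomatches = [], []
    for k1, v1 in d1.items():
        # A d1 entry matches if any d2 entry is close in both key and value
        found = any(
            abs(float(k1) - float(k2)) <= key_tol
            and abs(float(v1) - float(v2)) <= value_tol
            for k2, v2 in d2.items()
        )
        (matches if found else nomatches).append((k1, v1))
    return matches, nomatches


def write_pairs(filename, pairs):
    """Write a header line followed by one 'key ---- value' line per pair."""
    with open(filename, 'w') as fh:
        fh.write('key ---- value\n')
        for key, value in pairs:
            fh.write('%s ---- %s\n' % (key, value))


matches, nomatches = split_matches(d1, d2, key_tolerance, value_tolerance)
write_pairs('Output_match.txt', matches)
write_pairs('Output_nomatch.txt', nomatches)
```

Using `any` stops scanning d2 as soon as one match is found and places each d1 entry in exactly one list, so no set difference is needed afterwards and the original order of d1 is kept.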