Comparing dictionary keys with a tolerance

Time: 2015-04-08 13:16:35

Tags: python dictionary

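I have two dictionaries:

d1 = {'100.1125': '353.2216', '151.0977': '131.2193', '102.0553': '103.6859', '103.0209': '104.624'}

d2 = {'100.1124': '352.2220', '200': '131.2193', '300': '103.6859', '400': '104.624', '103.0545': '448.3161'}

I want to loop through the keys in d1 and check whether each one exists in d2, within a specified +/- tolerance. If a key matches, I want to compare the values associated with the two keys and check whether they also match within a given tolerance. If a match is found, I want to write the output to a file (Output_match.txt). If no match is found, I want to write the d1 key and its associated value to a second file (Output_nomatch.txt).

So, let's say the tolerance for the dictionary key comparison is +/- 0.0002, and the tolerance for the value comparison is +/- 5.

I would like Output_match.txt to contain: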

key ---- value
100.1125 ---- 353.2216

I would like Output_nomatch.txt to contain:

key ---- value
151.0977 ---- 131.2193
102.0553 ---- 103.6859
103.0209 ---- 104.624

Can anyone offer any help here?

Edit:

Apologies for not providing my current attempt:

import os

# path, ppm, rt_shift, d1 and d2 are assumed to be defined earlier.
a, b = {}, {}  # entries of d1 and d2 that took part in at least one match

with open(os.path.join(path, 'out_true.txt'), 'w') as opt_true, \
        open(os.path.join(path, 'out_false.txt'), 'w') as opt_false:
    header = ('%s\t%s\t%s\t%s\n') % ('file1_mz', 'file1_rt', 'file2_mz', 'file2_rt')
    opt_true.write(header)
    for key in d1:
        # Tolerance window around the d1 entry (ppm for the key, rt_shift for the value)
        upper_mz = float(key) + (float(key) * (ppm*0.000001))
        lower_mz = float(key) - (float(key) * (ppm*0.000001))
        upper_rt = float(d1[key]) + (2*rt_shift)
        lower_rt = float(d1[key]) - (2*rt_shift)
        for key2 in d2:
            # Tolerance window around the d2 entry
            upper_mz2 = float(key2) + (float(key2) * (ppm*0.000001))
            lower_mz2 = float(key2) - (float(key2) * (ppm*0.000001))
            upper_rt2 = float(d2[key2]) + (2*rt_shift)
            lower_rt2 = float(d2[key2]) - (2*rt_shift)
            # A match means both windows overlap
            if upper_mz >= lower_mz2 and lower_mz <= upper_mz2 and upper_rt >= lower_rt2 and lower_rt <= upper_rt2:
                opt_true.write(('%s\t%s\t%s\t%s\n') % (key, d1[key], key2, d2[key2]))
                a[key] = d1[key]
                b[key2] = d2[key2]
    # Write every entry that never matched; the with block closes both files
    for key in d1:
        if key not in a:
            opt_false.write(('%s\t%s\n') % (key, d1[key]))
    for key2 in d2:
        if key2 not in b:
            opt_false.write(('%s\t%s\n') % (key2, d2[key2]))

1 answer:

Answer 0 (score: 1)

You could try something like this, using the built-in function abs to compute the distance in a simple way:

d1 = {'100.1125': '353.2216', '151.0977': '131.2193',
      '102.0553': '103.6859', '103.0209': '104.624'}
d2 = {'100.1124': '352.2220', '200': '131.2193',
      '300': '103.6859', '400': '104.624', '103.0545': '448.3161'}

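key_tolerance, value_tolerance = 0.0002, 5
output_match, output_nomatch = [], []

# Compare every d1 entry against every d2 entry
for i, j in d1.items():
    for k, l in d2.items():
        if (abs(float(i)-float(k)) <= key_tolerance and
            abs(float(j)-float(l)) <= value_tolerance):
            output_match.append((i, j))
        else:
            # Recorded once per non-matching pair, so it can also hold matched entries
            output_nomatch.append((i, j))

# The set difference drops entries that matched at least once
print(output_match, '----', set(output_nomatch) - set(output_match), sep='\n')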

Output:

[('100.1125', '353.2216')]
----
{('102.0553', '103.6859'), 
 ('103.0209', '104.624'), 
 ('151.0977', '131.2193')}
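
If you also want the two files described in the question, the same abs comparison can be wrapped in a few small functions. This is only a minimal sketch under some assumptions: the helper names has_match, split_matches and write_pairs are made up for illustration, the files are written to the current working directory, and the f-strings need Python 3.6 or later. Each d1 entry ends up in exactly one list, so no set difference is needed afterwards.

```python
d1 = {'100.1125': '353.2216', '151.0977': '131.2193',
      '102.0553': '103.6859', '103.0209': '104.624'}
d2 = {'100.1124': '352.2220', '200': '131.2193',
      '300': '103.6859', '400': '104.624', '103.0545': '448.3161'}

KEY_TOLERANCE = 0.0002
VALUE_TOLERANCE = 5


def has_match(key, value, other, key_tol, value_tol):
    """Return True if `other` has an entry within both tolerances (hypothetical helper)."""
    return any(
        abs(float(key) - float(k)) <= key_tol
        and abs(float(value) - float(v)) <= value_tol
        for k, v in other.items()
    )


def split_matches(first, second, key_tol, value_tol):
    """Split the entries of `first` into (matched, unmatched) lists of pairs."""
    matched, unmatched = [], []
    for key, value in first.items():
        if has_match(key, value, second, key_tol, value_tol):
            matched.append((key, value))
        else:
            unmatched.append((key, value))
    return matched, unmatched


def write_pairs(filename, pairs):
    """Write a header line followed by one 'key ---- value' line per pair."""
    with open(filename, 'w') as handle:
        handle.write('key ---- value\n')
        for key, value in pairs:
            handle.write(f'{key} ---- {value}\n')


matched, unmatched = split_matches(d1, d2, KEY_TOLERANCE, VALUE_TOLERANCE)
write_pairs('Output_match.txt', matched)
write_pairs('Output_nomatch.txt', unmatched)
```

Like the loop above, this still compares every d1 entry with every d2 entry, which should be fine for small dictionaries; any() simply stops scanning d2 as soon as the first match is found.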