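I'm new to programming and I want to compare two lists of lists in Python, where the floats in these lists may contain small errors. Here is an example: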
first_list = [['ATOM', 'N', 'SER', -1.081, -16.465, 17.224],
              ['ATOM', 'C', 'SER', 2.805, -3.504, 6.222],
              ['ATOM', 'O', 'SER', -17.749, 16.241, -1.333]]
secnd_list = [['ATOM', 'N', 'SER', -1.082, -16.465, 17.227],
              ['ATOM', 'C', 'SER', 2.142, -3.914, 6.222],
              ['ATOM', 'O', 'SER', -17.541, -16.241, -1.334]]
Expected output:
Differences = ['ATOM', 'C', 'SER', 2.805, -3.504, 6.222]
My attempts so far:
def aprox (x, y):
    if x == float and y == float:
        delta = 0.2 >= abs(x - y)
        return delta
    else: rest = x, y
    return rest
def compare (data1, data2):
    diff = [x for x,y in first_list if x not in secnd_list and aprox(x,y)] + [x for x,y in secnd_list if x not in first_list and aprox(x,y)]
    return diff
Or with the help of tuples, but there I don't know how to build in the approximation:
def compare (data1, data2):
    first_set = set(map(tuple, data1))
    secnd_set = set(map(tuple, data2))
    diff = first_set.symmetric_difference(secnd_set)
    return diff
Hope you can help me! :)
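Answer 0 (score: 4)
The line
if x == float and y == float
is not correct: it compares the values with the type object float itself, so it is always False for actual numbers.
The proper way to check a variable's type is to use the type() function.
Try replacing it with
if type(x) is float and type(y) is float: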
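To make the difference concrete, here is a minimal sketch (Python 3 assumed; the helper name both_floats is made up for this example and is not part of the question's code). It also shows isinstance(), which is the more common idiom for this kind of check and additionally accepts subclasses of float:

```python
x, y = 2.805, 2.142

# A float value is never equal to the type object `float` itself,
# so the original condition is False for every pair of numbers.
print(x == float)        # False

# Comparing the type of the value works as intended.
print(type(x) is float)  # True


def both_floats(a, b):
    """Hypothetical helper: True only when both arguments are floats."""
    return isinstance(a, float) and isinstance(b, float)


print(both_floats(x, y))       # True
print(both_floats('ATOM', y))  # False
```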
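Answer 1 (score: 0)
This is a bit clunky, but I wrote it on the fly and it should get you the result you want. As I noticed in your code, you set the threshold to 0.2, which means two rows should be returned, not one as you stated.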
def discrepancies(x, y):
    for row1, row2 in zip(x, y):
        # Compare only the numeric columns (index 3 onwards).
        for item1, item2 in zip(row1[3:], row2[3:]):
            if abs(item1 - item2) >= 0.2:
                print(row1)
                break  # one differing value is enough to report the row
discrepancies(first_list, secnd_list)
['ATOM', 'C', 'SER', 2.805, -3.504, 6.222]
['ATOM', 'O', 'SER', -17.749, 16.241, -1.333]
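A few notes: the nested loops visit every numeric value once, so the running time grows with rows × columns and this will get fairly slow for large inputs. For larger lists of lists on Python 2, I would use what I believe is called the itertools.izip function, which avoids building the zipped list in memory; on Python 3 the built-in zip is already lazy. Hope this helps!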
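If you would rather collect the rows than print them, the same idea can be sketched as a function that returns a list. This is only an illustration under assumptions: the name find_discrepancies and the parameters tol and first_numeric are invented here, and the data is the unmodified data from the question.

```python
first_list = [['ATOM', 'N', 'SER', -1.081, -16.465, 17.224],
              ['ATOM', 'C', 'SER', 2.805, -3.504, 6.222],
              ['ATOM', 'O', 'SER', -17.749, 16.241, -1.333]]
secnd_list = [['ATOM', 'N', 'SER', -1.082, -16.465, 17.227],
              ['ATOM', 'C', 'SER', 2.142, -3.914, 6.222],
              ['ATOM', 'O', 'SER', -17.541, -16.241, -1.334]]


def find_discrepancies(rows_a, rows_b, tol=0.2, first_numeric=3):
    """Return the rows of rows_a with at least one value differing by >= tol.

    Hypothetical helper: rows are paired by position, and only the columns
    from index first_numeric onwards are treated as numbers.
    """
    differing = []
    for row_a, row_b in zip(rows_a, rows_b):
        pairs = zip(row_a[first_numeric:], row_b[first_numeric:])
        # any() stops at the first value that is too far apart.
        if any(abs(a - b) >= tol for a, b in pairs):
            differing.append(row_a)
    return differing


result = find_discrepancies(first_list, secnd_list)
for row in result:
    print(row)
```

With the question's data this reports the 'C' and 'O' rows, the same two rows as the printed output above.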
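Answer 2 (score: 0)
You can probably iterate over each element of both lists and then compare the sub-elements.
Then, when any pair of sub-elements is not equal, the row can be added to the result depending on the type: if two strings are not equal, the row is added to the result; if they are floats, math.isclose() can be used for the approximate comparison.
Note: a correction was made to match the expected output: the minus sign was missing in the third element of first_list.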
import math
first_list = [['ATOM', 'N', 'SER', -1.081, -16.465, 17.224],
              ['ATOM', 'C', 'SER', 2.805, -3.504, 6.222],
              ['ATOM', 'O', 'SER', -17.749, -16.241, -1.333]] # changes made
secnd_list = [['ATOM', 'N', 'SER', -1.082, -16.465, 17.227],
              ['ATOM', 'C', 'SER', 2.142, -3.914, 6.222],
              ['ATOM', 'O', 'SER', -17.541, -16.241, -1.334]]
diff = []
for e1, e2 in zip(first_list, secnd_list):
    for e_sub1, e_sub2 in zip(e1, e2):
        # if sub-elements are not equal
        if e_sub1 != e_sub2:
            # if it is string and not equal
            if isinstance(e_sub1, str):
                diff.append(e1)
                break # one element not equal so no need to iterate other sub-elements
            else: # is float and not equal
                # rel_tol=2e-1 is a relative tolerance (20% of the larger magnitude),
                # not an absolute difference of 0.2
                if not math.isclose(e_sub1, e_sub2, rel_tol=2e-1):
                    diff.append(e1)
                    break # one element not equal so no need to iterate other sub-elements
print(diff)
Output:
[['ATOM', 'C', 'SER', 2.805, -3.504, 6.222]]
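One thing to be aware of: rel_tol in math.isclose() is a relative tolerance, so rel_tol=2e-1 accepts differences of up to 20% of the larger value, which is not the same as the question's absolute threshold of 0.2. The sketch below contrasts the two on the 'O' row's first coordinate; the variable names are chosen here only for illustration.

```python
import math

a, b = -17.749, -17.541  # these differ by roughly 0.208

# Relative tolerance: allowed difference is 0.2 * 17.749, about 3.55.
close_relative = math.isclose(a, b, rel_tol=0.2)

# Absolute tolerance only: allowed difference is exactly 0.2.
close_absolute = math.isclose(a, b, rel_tol=0.0, abs_tol=0.2)

print(close_relative)  # True
print(close_absolute)  # False
```

So with an absolute tolerance of 0.2 the 'O' row would also be reported, which matches the two-row result of the previous answer rather than the single row shown here.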