Question

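My question is similar to "unittest - compare list irrespective of order", but slightly different:

I have a function that returns a list of lists, with no guaranteed order at either level. I am testing it by comparing its output against a predetermined "this should be the answer" value. So if should_be_the_answer == [[1,2], [3,4]], the following return values should all pass:

[[1,2], [3,4]]
[[3,4], [1,2]]
[[2,1], [4,3]]
[[3,4], [2,1]]
(and more variations)

But not when the elements are mixed across sublists, so [[1,3], [2,4]] should fail.

assertCountEqual will not work, because it compares the sublists as they are (comparing the first list above with the last one tells me that [1,2] is not in the last list). All my values are unique, but they are dicts rather than ints, so converting them to sets is awkward.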
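
To make the limitation concrete, here is a minimal sketch of how assertCountEqual behaves on nested lists. The test class name is made up for illustration; only the standard unittest module is used.

```python
import unittest


class NestedOrderDemo(unittest.TestCase):
    def test_outer_order_is_ignored(self):
        # Reordering the sublists themselves is accepted.
        self.assertCountEqual([[1, 2], [3, 4]], [[3, 4], [1, 2]])

    def test_inner_order_still_matters(self):
        # Sublists are compared with ==, so [1, 2] != [2, 1] and the check fails.
        with self.assertRaises(AssertionError):
            self.assertCountEqual([[1, 2], [3, 4]], [[2, 1], [4, 3]])
```

Both tests pass, which shows that only the outer level is treated as unordered.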

Edit: some examples of what I am comparing. They are phone numbers, which have different interpretations because the calling country is uncertain:

[{'source': '001123456789',
  'interpretations': [
     {'prefix': '00',
      'country_code': '1',
      'national_part': '123456789'},
     {'prefix': '0011',
      'country_code': '234',
      'national_part': '56789'}]
 }, {'source': '0011987654321',
  'interpretations': [
     {'prefix': '00',
      'country_code': '1',
      'national_part': '1987654321'},
     {'prefix': '0011',
      'country_code': '98',
      'national_part': '7654321'}]
}]

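Comparing the source part is not a problem, but comparing the interpretations is.

What is the best way to solve this?

I have come up with a few solutions, but none of them feels very Pythonic, and they may be inefficient:

Turn the sublists into sets, using the repr of the dicts to make that possible. But converting them to strings just to compare them feels wrong, and it could give wrong results if the key order of a dict changes.
Sort the list and/or the sublists. That means providing a sort/comparison function for the dicts, which may be overkill.
Iterate over the sublists. For each one, check whether there is a corresponding sublist that satisfies assertCountEqual. This works, but it can become very expensive as the output grows (though it may be fine for my use case).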
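
For reference, the third option could look roughly like the sketch below. The function names same_items and same_nested_items are made up for illustration. It relies only on ==, so it works with unhashable values such as dicts, at the cost of a quadratic number of comparisons at each level.

```python
def same_items(first, second):
    """Return True if both sequences hold equal items, ignoring order.

    Only == is used, so unhashable items such as dicts are fine.
    """
    remaining = list(second)
    for item in first:
        try:
            remaining.remove(item)  # drops the first element equal to item
        except ValueError:
            return False
    return not remaining


def same_nested_items(actual, expected):
    """Compare two lists of lists, ignoring order at both levels."""
    remaining = list(expected)
    for sublist in actual:
        for i, candidate in enumerate(remaining):
            if same_items(sublist, candidate):
                del remaining[i]  # each expected sublist may match only once
                break
        else:
            return False  # no expected sublist matches this one
    return not remaining
```

In a test this would be used as self.assertTrue(same_nested_items(result, should_be_the_answer)), which loses the detailed failure message that assertCountEqual gives.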

Does anyone have a better idea?

Answer 1

I tried to implement what you need.

tl;dr: you can convert the items() of each dict into a frozenset instead of using repr()

expected1 = [[1, 2], [3, 4]]
data1 = [[2, 1], [4, 3]]

def unorder(data, is_dict=False):
    if is_dict:
        return set(frozenset(frozenset(d.items()) for d in sublist) for sublist in data) # d is dict here
    else:
        return set(frozenset(sublist) for sublist in data)


assert unorder(data1) == unorder(expected1)
d1 = {'a': 1}
d2 = {'b': 2, 'x': 12}
d3 = {'c': 3}
d4 = {'d': 4, 'y': -23}
expected2 = [[d1, d2], [d3, d4]]
data2 = [[d2, d1], [d4, d3]]

assert unorder(data2, is_dict=True) == unorder(expected2, is_dict=True)

I used frozensets because they are immutable and therefore hashable, so they can be added to another set.
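
The same idea can be adapted to the phone number structure from the question. This is only a sketch: the helper name normalize is made up, and it assumes every entry has exactly the keys 'source' and 'interpretations' and that all values inside the interpretation dicts are hashable (strings in the example).

```python
def normalize(entries):
    """Turn the list of entries into an order-insensitive, hashable form."""
    return {
        (
            entry["source"],
            # Each interpretation dict becomes a frozenset of (key, value) pairs.
            frozenset(frozenset(d.items()) for d in entry["interpretations"]),
        )
        for entry in entries
    }
```

Two results can then be compared with assert normalize(result) == normalize(expected). Note that sets collapse duplicates, which is acceptable here because the question states that all values are unique.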

Answer 2

I don't think sorting is overkill. It can be done quite easily:

def mysort(l):
    l2 = sorted(l, key=lambda x: x["source"])
    for entry in l2:
        entry["interpretations"].sort(key=lambda x: x["prefix"])
    return l2

assert mysort(a) == mysort(b)

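Currently it uses only prefix as the sort key for the interpretations, which works for the example data but may not be enough for real data.
In that case, just use a combined key such as f"{x['prefix']}-{x['country_code']}".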
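
As a variation on the combined key, a tuple can be used instead of a formatted string, which avoids any ambiguity introduced by the separator. The sketch below also returns new lists rather than sorting the interpretations in place, so the inputs are left untouched. The name sorted_copy is made up for illustration.

```python
def sorted_copy(entries):
    """Return a sorted copy of entries without mutating the input."""
    return sorted(
        (
            {
                **entry,
                # Tuples compare element by element: prefix first, then country_code.
                "interpretations": sorted(
                    entry["interpretations"],
                    key=lambda x: (x["prefix"], x["country_code"]),
                ),
            }
            for entry in entries
        ),
        key=lambda e: e["source"],
    )
```

It is used the same way as mysort: assert sorted_copy(a) == sorted_copy(b).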

Performance, using only the interpretations part: test=example_data[0]["interpretations"]:

# Creating a set from the list and frozensets from the dicts, as suggested in another answer:
timeit.timeit(lambda test=test: set(frozenset(d.items()) for d in test), number=1000000)
1.0023461619994123

# Sorting by single custom key:
timeit.timeit(lambda test=test: test.sort(key=lambda x: x["prefix"]), number=1000000)
0.32801106399983837

# Sorting by combined custom key:
timeit.timeit(lambda test=test: test.sort(key=lambda x: f"{x['prefix']}-{x['country_code']}"), number=1000000)
0.5157679960002497
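
A self-contained script along these lines can be used to repeat the comparison. It is a sketch, not the script that produced the numbers above: the data is copied from the first example entry, run_benchmarks is a made-up name, the default number is lower, and sorted() is used instead of list.sort() so that each iteration starts from the same unsorted input rather than a list that was already sorted in place by the previous iteration.

```python
import timeit

# Interpretations of the first example entry from the question.
test = [
    {"prefix": "00", "country_code": "1", "national_part": "123456789"},
    {"prefix": "0011", "country_code": "234", "national_part": "56789"},
]


def run_benchmarks(number=100_000):
    """Time each normalization strategy and return seconds per strategy."""
    candidates = {
        "frozenset": lambda: set(frozenset(d.items()) for d in test),
        "sorted, single key": lambda: sorted(test, key=lambda x: x["prefix"]),
        "sorted, tuple key": lambda: sorted(
            test, key=lambda x: (x["prefix"], x["country_code"])
        ),
    }
    return {
        name: timeit.timeit(func, number=number)
        for name, func in candidates.items()
    }
```

Calling run_benchmarks() returns a dict mapping each strategy name to its total time in seconds. Absolute values depend on the machine and Python version.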

Unit test - compare list of lists irrespective of order
