Question

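For single-layer dicts like x = {'a': 1, 'b': 2}, this problem is easy and has already been answered on SO (Pythonic way to check if two dictionaries have the identical set of keys?), but what about nested dicts?

For example, y = {'a': {'c': 3}, 'b': {'d': 4}} has keys 'a' and 'b', but I want to compare its shape with another nested dict structure such as z = {'a': {'c': 5}, 'b': {'d': 6}}, which has the same shape and keys as y (only the values differ). w = {'a': {'c': 3}, 'b': {'e': 4}} also has keys 'a' and 'b', but its next layer differs from y, because w['b'] has key 'e' while y['b'] has key 'd'.

I want a short function of two arguments, dict_1 and dict_2, that returns True if they have the same shape and keys as described above, and False otherwise.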
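To make the gap concrete, here is a minimal sketch of why the flat-dict check is not enough for nested dicts. The dict x2 is a hypothetical example added for illustration; y and w are the dicts from the question.

```python
x = {'a': 1, 'b': 2}
x2 = {'a': 10, 'b': 20}  # hypothetical second flat dict, not from the question

# Flat dicts: comparing the key views is enough.
flat_same = x.keys() == x2.keys()

y = {'a': {'c': 3}, 'b': {'d': 4}}
w = {'a': {'c': 3}, 'b': {'e': 4}}

# Nested dicts: the top-level keys match even though the inner keys differ,
# so a top-level comparison alone would wrongly report "same shape".
top_level_same = y.keys() == w.keys()
inner_same = y['b'].keys() == w['b'].keys()

print(flat_same, top_level_same, inner_same)  # True True False
```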

Answer 1

This builds a copy of each dictionary stripped of any non-dictionary values, then compares the copies:

def getshape(d):
    if isinstance(d, dict):
        return {k:getshape(d[k]) for k in d}
    else:
        # Replace all non-dict values with None.
        return None

def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)
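As a rough illustration of what the intermediate "shape" looks like, the sketch below runs the two functions on the y, z and w dicts from the question. The functions are repeated here only so the snippet runs on its own.

```python
def getshape(d):
    if isinstance(d, dict):
        return {k: getshape(d[k]) for k in d}
    # Every non-dict value collapses to None, so only the key structure remains.
    return None

def shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)

y = {'a': {'c': 3}, 'b': {'d': 4}}
z = {'a': {'c': 5}, 'b': {'d': 6}}
w = {'a': {'c': 3}, 'b': {'e': 4}}

print(getshape(y))        # {'a': {'c': None}, 'b': {'d': None}}
print(shape_equal(y, z))  # True
print(shape_equal(y, w))  # False
```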

Answer 2

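I like nneonneo's answer, and it should be relatively fast, but I wanted something that doesn't create extra, unnecessary data structures (I've been learning about memory fragmentation in Python). This may or may not be as fast or faster.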

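(Edit: Spoiler!)

It is faster by a good enough margin to make it preferable in all cases; see the other answer with the analysis.

And if you are dealing with lots of these dicts and have memory problems, doing it this way is likely to be preferable anyway.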

Implementation

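This should work in Python 3, and possibly in 2.7 if you replace keys with viewkeys; it definitely won't work in 2.6. It relies on the set-like view of the keys that dicts have: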

def sameshape(d1, d2):
    if isinstance(d1, dict):
        if isinstance(d2, dict):
            # then we have shapes to check
            return (d1.keys() == d2.keys() and
                    # so the keys are all the same
                    all(sameshape(d1[k], d2[k]) for k in d1.keys()))
                    # thus all values will be tested in the same way.
        else:
            return False # d1 is a dict, but d2 isn't
    else:
        return not isinstance(d2, dict) # if d2 is a dict, False, else True.

Edit: updated to remove a redundant type check, so it is now even more efficient.
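The set-like behaviour of key views that the function depends on can be seen in isolation. This is a small sketch with made-up dicts (d1, d2 and d3 are illustrative names, not taken from the answer):

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'a': 10}  # same keys, different order and values
d3 = {'a': 1, 'c': 3}    # one key differs

# Key views compare like sets: order and values are ignored.
same_keys = d1.keys() == d2.keys()
different_keys = d1.keys() == d3.keys()

# Set operators work directly on the views, which helps when reporting a mismatch.
only_in_d1 = d1.keys() - d3.keys()
only_in_d3 = d3.keys() - d1.keys()

print(same_keys, different_keys)  # True False
print(only_in_d1, only_in_d3)     # {'b'} {'c'}
```

No new dict is built for the comparison itself; the views read from the existing dicts.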

Testing

To check:

print('expect false:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None: {} }}}))
print('expect true:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None:'foo'}}}))
print('expect false:')
print(sameshape({'foo':{'bar':{None:None}}}, {'foo':{'bar':{None:None, 'baz':'foo'}}}))

Prints:

expect false:
False
expect true:
True
expect false:
False
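For completeness, the same function can be checked against the dicts from the question. This is only a sketch; sameshape is repeated so the snippet runs on its own, and the assertions replace the print-and-compare style used above.

```python
def sameshape(d1, d2):
    if isinstance(d1, dict):
        if isinstance(d2, dict):
            # Same key sets at this level, then recurse into each pair of values.
            return (d1.keys() == d2.keys() and
                    all(sameshape(d1[k], d2[k]) for k in d1.keys()))
        return False  # d1 is a dict, but d2 isn't
    return not isinstance(d2, dict)

y = {'a': {'c': 3}, 'b': {'d': 4}}
z = {'a': {'c': 5}, 'b': {'d': 6}}
w = {'a': {'c': 3}, 'b': {'e': 4}}

assert sameshape(y, z)         # same keys at every level
assert not sameshape(y, w)     # inner key 'd' vs 'e'
assert not sameshape(y, 'a')   # dict vs non-dict
assert sameshape(1, 'a')       # two non-dict leaves count as the same shape
```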

Answer 3

To profile the two answers that currently exist, first import timeit:

import timeit

Now we need to set up the code:

setup = '''
import copy

def getshape(d):
    if isinstance(d, dict):
        return {k:getshape(d[k]) for k in d}
    else:
        # Replace all non-dict values with None.
        return None

def nneo_shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)

def aaron_shape_equal(d1,d2):
    if isinstance(d1, dict) and isinstance(d2, dict):
        return (d1.keys() == d2.keys() and 
                all(aaron_shape_equal(d1[k], d2[k]) for k in d1.keys()))
    else:
        return not (isinstance(d1, dict) or isinstance(d2, dict))

class Vividict(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

d = Vividict()

d['foo']['bar']
d['foo']['baz']
d['fizz']['buzz']
d['primary']['secondary']['tertiary']['quaternary']

d0 = copy.deepcopy(d)
d1 = copy.deepcopy(d)
d1['primary']['secondary']['tertiary']['extra']
# d == d0 is True
# d == d1 is now False!
'''
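The setup builds its nested test data with Vividict, whose `__missing__` hook creates a new nested Vividict whenever a missing key is looked up. A minimal sketch of that behaviour, using a shorter key chain than the setup does:

```python
class Vividict(dict):
    def __missing__(self, key):
        # Called by dict.__getitem__ when the key is absent:
        # store a fresh nested Vividict under that key and return it.
        value = self[key] = type(self)()
        return value

d = Vividict()

# Merely evaluating these lookups creates the nested dicts as a side effect.
d['foo']['bar']
d['primary']['secondary']['tertiary']

print(d)
# {'foo': {'bar': {}}, 'primary': {'secondary': {'tertiary': {}}}}
```

Because Vividict subclasses dict, both shape functions treat these nested objects as ordinary dicts.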

Now let's test the two options, first with Python 3.3!

>>> timeit.repeat('nneo_shape_equal(d0, d); nneo_shape_equal(d1,d)', setup=setup)
[36.784881490981206, 36.212246977956966, 36.29759863798972]

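Compared with the timings for my solution, shown next, it looks like my solution takes about 2/3 to 3/4 of the time, making it more than 1.25 times as fast.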

>>> timeit.repeat('aaron_shape_equal(d0, d); aaron_shape_equal(d1,d)', setup=setup)
[26.838892214931548, 26.61037168605253, 27.170253590098582]
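Each number above is the total for timeit's default of 1,000,000 executions of the statement, which is why the runs take tens of seconds. A quicker comparison could be sketched as below; the small literal dicts and the `number=10000` value are assumptions chosen for illustration, not the data or settings used for the measurements above.

```python
import timeit

def getshape(d):
    if isinstance(d, dict):
        return {k: getshape(d[k]) for k in d}
    return None

def nneo_shape_equal(d1, d2):
    return getshape(d1) == getshape(d2)

def aaron_shape_equal(d1, d2):
    if isinstance(d1, dict) and isinstance(d2, dict):
        return (d1.keys() == d2.keys() and
                all(aaron_shape_equal(d1[k], d2[k]) for k in d1.keys()))
    return not (isinstance(d1, dict) or isinstance(d2, dict))

# Hypothetical small test data: the second dict has one extra nested key.
d = {'foo': {'bar': {}, 'baz': {}}, 'fizz': {'buzz': {}}}
d1 = {'foo': {'bar': {}, 'baz': {}}, 'fizz': {'buzz': {}, 'extra': {}}}

# Passing a callable avoids the setup string; number is lowered from the default.
nneo_times = timeit.repeat(lambda: nneo_shape_equal(d, d1), repeat=3, number=10000)
aaron_times = timeit.repeat(lambda: aaron_shape_equal(d, d1), repeat=3, number=10000)

# The minimum of the repeats is the usual figure to compare.
print(min(nneo_times), min(aaron_times))
```

Absolute times will differ by machine and Python version, so only the ratio between the two is meaningful.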

On a Python 3.4 (alpha) build that I compiled myself:

>>> timeit.repeat('nneo_shape_equal(d0, d); nneo_shape_equal(d1,d)', setup=setup)
[272.5629618819803, 273.49581588001456, 270.13374400604516]
>>> timeit.repeat('aaron_shape_equal(d0, d); aaron_shape_equal(d1,d)', setup=setup)
[214.87033835891634, 215.69223327597138, 214.85333003790583]

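Still roughly the same ratio. The difference in absolute times between the two versions is probably because I compiled 3.4 myself without optimizations.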
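The ratios can be worked out directly from the timings reported above. Taking the minimum of each run is my assumption about how to summarise them; the answer does not say which figure it used.

```python
# Timings copied from the timeit output above (seconds per 1,000,000 executions).
py33_nneo = [36.784881490981206, 36.212246977956966, 36.29759863798972]
py33_aaron = [26.838892214931548, 26.61037168605253, 27.170253590098582]
py34_nneo = [272.5629618819803, 273.49581588001456, 270.13374400604516]
py34_aaron = [214.87033835891634, 215.69223327597138, 214.85333003790583]

def time_ratio(aaron, nneo):
    # Fraction of nneo's best time that aaron's best time takes.
    return min(aaron) / min(nneo)

ratio_33 = time_ratio(py33_aaron, py33_nneo)
ratio_34 = time_ratio(py34_aaron, py34_nneo)

print(f"3.3: ratio {ratio_33:.2f}, speedup {1 / ratio_33:.2f}x")  # 3.3: ratio 0.73, speedup 1.36x
print(f"3.4: ratio {ratio_34:.2f}, speedup {1 / ratio_34:.2f}x")  # 3.4: ratio 0.80, speedup 1.26x
```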

Thanks, all, for reading!

Check if Python dicts have the same shape and keys
