Question

I have arbitrarily nested container objects (for example, lists and dicts).

After calling a function, I want to test whether the container object was mutated.

>>> x = [[1,2,3], {1,2,3}, "other data", 1]
>>> non_mutating_func(x)
>>> x
[[1,2,3], {1,2,3}, "other data", 1] 
>>> mutating_func(x)
>>> x
[[100,2,3], {1,2,3}, "other data", 1] # One of the inner lists got changed. x got mutated.

I also want to check object identity. Here is an example of what I mean by that:

>>> a = [[1,2],1,2]
>>> def f(x):
...     x[0] = [1,2]
...
>>> b = a[0]
>>> f(a)
>>> b is a[0]
False

The list [1,2] at a[0] was replaced by another list [1,2], but the two lists are different objects. So it counts as mutated.

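Note: previously, for a non-nested list, I could do this:

x = [1,2,3,4]
x_ori = x[:]
f(x)
mutated = False
if len(x) != len(x_ori):
    mutated = True
for i,j in zip(x, x_ori):
    if not (i is j):
        mutated = True
        break

Also, the original container may be a dict rather than a list.

Is this possible for nested containers? If so, how would I do it?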

Answer 1

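The tricky part is the "same instance" check. You could recursively compute a hash code for the whole structure, or create a deep copy and compare the two, but neither approach can detect that an element was replaced by an equal but distinct instance.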
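As a small illustration of why a deep copy alone is not enough, here is a sketch that reuses the list from the question. The names `original_inner` and `reference` are made up for this example.

```python
import copy

a = [[1, 2], 1, 2]
original_inner = a[0]          # keep a handle on the inner list
reference = copy.deepcopy(a)   # value-only reference copy

a[0] = [1, 2]  # swap in an equal but distinct list

print(a == reference)          # True: the values still match
print(a[0] is original_inner)  # False: the inner object was replaced
```

The equality comparison reports no change even though the inner list is a different object, which is exactly the case the question wants to catch.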

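You can create a copy of the original list to use as a reference later, but that alone is not enough: you have to pair every element in the structure with its original id: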

def backup(data):
    # similar for set, dict, tuples, etc.
    if isinstance(data, list):
        return id(data), [backup(x) for x in data]
    # basic immutable stuff, string, numbers, etc.
    return id(data), data

Then you can walk the structure recursively, comparing all the IDs and recursively comparing the contents of any substructures:

def check(backup, data):
    id_, copy = backup
    # check whether it's still the same instance
    if id_ != id(data):
        return False
    # similar for set, dict, tuples, etc.
    if isinstance(data, list):
        return len(data) == len(copy) and all(check(b, d) for b, d in zip(copy, data))
    # basic immutable stuff must be equal due to equal ID
    return True

Here is an example, along with a few sample modifications:

data = [[1,2,3], [4, [5,6], [7,8]], 9]
b = backup(data)
# data[1][0] = 4        # check -> True, replaced with identical value
# data[1][1] = [5,6]    # check -> False, replaced with equal value
# data[1][1].append(10) # check -> False, original value modified
print(check(b, data))

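Of course, both functions are incomplete and have to be extended to other structures such as dict, set, tuple, etc. For set and dict you may want to compare the sorted entries, but otherwise the handling is very similar in nature.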
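One possible way to extend the two functions is sketched below. It is only an assumption about how the extension could look, not the answer's own code: dict entries are matched by key and set elements by id instead of sorting them (sorting can fail for mixed types), and the parameter name `saved` is introduced here.

```python
def backup(data):
    """Pair every object with its id(), recursing into common containers."""
    if isinstance(data, (list, tuple)):
        return id(data), [backup(x) for x in data]
    if isinstance(data, dict):
        # Map each key to (id of the key, backup of the value).
        return id(data), {k: (id(k), backup(v)) for k, v in data.items()}
    if isinstance(data, (set, frozenset)):
        # Set elements are hashable, so index their backups by id.
        return id(data), {id(x): backup(x) for x in data}
    # Anything else is treated as an immutable leaf.
    return id(data), data


def check(saved, data):
    """Return True if `data` still matches the structure stored in `saved`."""
    id_, copy = saved
    if id_ != id(data):
        return False
    if isinstance(data, (list, tuple)):
        return len(data) == len(copy) and all(
            check(b, d) for b, d in zip(copy, data))
    if isinstance(data, dict):
        if len(data) != len(copy):
            return False
        for k, v in data.items():
            if k not in copy:
                return False
            key_id, value_backup = copy[k]
            if key_id != id(k) or not check(value_backup, v):
                return False
        return True
    if isinstance(data, (set, frozenset)):
        return len(data) == len(copy) and all(
            id(x) in copy and check(copy[id(x)], x) for x in data)
    return True


x = [[1, 2, 3], {1, 2, 3}, "other data", 1, {"k": [1]}]
b = backup(x)
print(check(b, x))       # True: nothing has been touched yet
x[4]["k"].append(2)
print(check(b, x))       # False: a list nested inside the dict grew
```

This sketch ignores dict insertion order, so removing a key and re-adding the same key and value objects would not be reported.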

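Note that, technically, this does not guarantee that the list was not modified: an ID can be reused after the original object with that ID has been garbage-collected. In the general case, though, the above should work.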
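One way to sidestep ID reuse, sketched here as an assumption rather than as part of the answer, is to keep a strong reference to each original object and compare with `is`. An object that is still referenced cannot be collected, so its identity cannot be recycled. The names `snapshot`, `unchanged` and `mutates` are hypothetical, and only lists are handled.

```python
def snapshot(data):
    """Return (object, children); the object itself is kept alive."""
    if isinstance(data, list):
        return data, [snapshot(x) for x in data]
    return data, None


def unchanged(snap, data):
    """Return True if `data` is the same object with the same children."""
    ref, children = snap
    if ref is not data:
        return False
    if isinstance(data, list):
        return len(data) == len(children) and all(
            unchanged(s, d) for s, d in zip(children, data))
    return True


def mutates(func, data):
    """Call func(data) and report whether data was mutated."""
    snap = snapshot(data)
    func(data)
    return not unchanged(snap, data)


data = [[1, 2], [3]]
print(mutates(lambda d: None, data))            # False
print(mutates(lambda d: d[1].append(4), data))  # True
```

The trade-off is memory: every object in the structure stays alive for as long as the snapshot does.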

Answer 2

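There are two broad approaches: verify after the fact, or prevent the mutating operations from happening in the first place. Here is a sketch of a proxy class that blocks access to __setitem__ and similar methods.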

names = ['__setitem__', 'append', 'pop', 'add', 'remove', 'update']
class immutable_mixin:
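    def __getattribute__(self, name):
        if name in names: raise TypeError
        return super().__getattribute__(name)
    # m[k] = v looks __setitem__ up on the type and bypasses __getattribute__,
    # so item assignment has to be blocked explicitly as well.
    def __setitem__(self, k, v): raise TypeError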
    def __getitem__(self, k): return wrap(super().__getitem__(k))
    def __iter__(self): return map(wrap, super().__iter__())
    def __repr__(self): return '>>{}<<'.format(super().__repr__())

class immutable_list(immutable_mixin, list): pass
class immutable_set(immutable_mixin, set): pass
class immutable_dict(immutable_mixin, dict): pass

def wrap(x):
    if isinstance(x, (int, str, bytes)): return x
    elif isinstance(x, list): return immutable_list(x)
    elif isinstance(x, set): return immutable_set(x)
    elif isinstance(x, dict): return immutable_dict(x)
    else: return 'FIXME' + repr(x)

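In short, mutating operations raise TypeError, and getter operations make sure the returned value is itself proxied (or is of a type that cannot contain other values).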

>>> x = [[1,2,3], {1,2,3}, "other data", 1, {1:1, "2":"2"}]
>>> m = wrap(x)
>>> m
>>[[1, 2, 3], {1, 2, 3}, 'other data', 1, {1: 1, '2': '2'}]<<
>>> list(m)
[>>[1, 2, 3]<<, >>immutable_set({1, 2, 3})<<, 'other data', 1, >>{1: 1, '2': '2'}<<]

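This is likely to be fragile in the face of non-standard containers (e.g. defaultdict). It also needs to be comprehensive to work: I forgot to include __delitem__ and __reversed__, as well as list.extend, and set arithmetic can also serve as an escape hatch (but list slicing cannot!). See the Python Data Model. Listing the allowed methods rather than the disallowed ones would probably be more robust, but the code would be longer.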
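A rough sketch of the allow-list idea, under the assumption that the read-only views from `collections.abc` are an acceptable base: `Sequence` and `Mapping` only supply non-mutating methods, so anything not listed simply does not exist on the view. The names `ReadOnlyList`, `ReadOnlyDict` and `readonly` are hypothetical and not part of the answer above.

```python
from collections.abc import Mapping, Sequence


class ReadOnlyList(Sequence):
    """Read-only view of a list; only Sequence's non-mutating API exists."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, index):
        # A slice yields a new list, which readonly() wraps again.
        return readonly(self._target[index])

    def __len__(self):
        return len(self._target)


class ReadOnlyDict(Mapping):
    """Read-only view of a dict; values are wrapped on access."""

    def __init__(self, target):
        self._target = target

    def __getitem__(self, key):
        return readonly(self._target[key])

    def __iter__(self):
        return iter(self._target)

    def __len__(self):
        return len(self._target)


def readonly(x):
    """Wrap known mutable containers; everything else passes through."""
    if isinstance(x, list):
        return ReadOnlyList(x)
    if isinstance(x, dict):
        return ReadOnlyDict(x)
    if isinstance(x, set):
        return frozenset(x)  # a frozen copy, not a live view
    return x


m = readonly([[1, 2, 3], {1, 2, 3}, {"k": [4]}])
try:
    m[0][0] = 100
except TypeError as exc:
    print("blocked:", exc)
```

This is not tamper-proof: the wrapped object is still reachable through `_target`, and mutable types that `readonly` does not know about are returned unwrapped.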

How can I check whether a nested container has been mutated?
