Question

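I have a class

- whose instances have attributes that are containers
  - which themselves contain containers, each containing many items
- that performs an expensive initialization of these containers

I want to create copies of instances such that

- the container attributes are copied, rather than shared as references, but
- the containers inside each container are not deep-copied, but are shared references
- a call to the class's expensive __init__() method is avoided if at all possible

As an example, let's use the class SetDict below, which, when an instance is created, initializes a dictionary-like data structure as an attribute, d. d stores integers as keys and sets as values.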

import collections

class SetDict(object):
    def __init__(self, size):
        self.d = collections.defaultdict(set)
        # Do some initialization; if size is large, this is expensive
        for i in range(size):
            self.d[i].add(1)

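I want to copy instances of SetDict so that d itself is copied, but the sets that are its values are not deep-copied; the copy should only hold references to the same sets.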

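For example, consider the current behavior of this class: copy.copy shares the attribute d with the new copy instead of copying it, while copy.deepcopy creates entirely new copies of the sets that are the values of d.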

>>> import copy
>>> s = SetDict(3)
>>> s.d
defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1])})
>>> # Try a basic copy
>>> t = copy.copy(s)
>>> # Add a new key, value pair in t.d
>>> t.d[3] = set([2])
>>> t.d
defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1]), 3: set([2])})
>>> # But oh no! We unintentionally also added the new key to s.d!
>>> s.d
defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1]), 3: set([2])})
>>> 
>>> s = SetDict(3)
>>> # Try a deep copy
>>> u = copy.deepcopy(s)
>>> u.d[0].add(2)
>>> u.d[0]
set([1, 2])
>>> # But oh no! 2 didn't get added to s.d[0]'s set
>>> s.d[0]
set([1])

The behavior I would like to see is the following:

>>> s = SetDict(3)
>>> s.d
defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1])})
>>> t = copy.copy(s)
>>> # Add a new key, value pair in t.d
>>> t.d[3] = set([2])
>>> t.d
defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1]), 3: set([2])})
>>> # s.d retains the same key-value pairs
>>> s.d
defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1])})
>>> t.d[0].add(2)
>>> t.d[0]
set([1, 2])
>>> # s.d[0] also had 2 added to its set
>>> s.d[0]
set([1, 2])

Here is my first attempt at a class that does this, but it fails due to infinite recursion:

class CopiableSetDict(SetDict):
    def __copy__(self):
        import copy
        # This version gives infinite recursion, but conveys what we
        # intend to do.
        #
        # First, create a shallow copy of this instance
        other = copy.copy(self)
        # Then create a separate shallow copy of the d
        # attribute
        other.d = copy.copy(self.d)
        return other

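I'm not sure how to correctly override the copy.copy (or copy.deepcopy) behavior to achieve this. I'm also not entirely sure whether I should be overriding copy.copy or copy.deepcopy. How can I get the desired copy behavior?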

Answer 1

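A class is a callable. When you call SetDict(3), the call is handled by the __call__ of the class's metaclass (type.__call__ by default), which first calls the constructor SetDict.__new__ and then calls the initializer __init__(3) on the return value of __new__, provided that value is an instance of SetDict. So you can get a new instance of SetDict (or any other class) without running its initializer by calling its constructor directly.

After that, you have an instance of your type; you simply attach regular (shallow) copies of any container attributes and return it. Something like this should do the trick.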

import collections
import copy

class SetDict(object):
    def __init__(self, size):
        self.d = collections.defaultdict(set)
        # Do some initialization; if size is large, this is expensive
        for i in range(size):
            self.d[i].add(1)

    def __copy__(self):
        other = SetDict.__new__(SetDict) 
        other.d = self.d.copy()
        return other

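__new__ is a static method and takes the class to be constructed as its first argument. It should be this simple unless you are overriding __new__ to do something, in which case you should show what it does so that this can be adapted. Here is test code that demonstrates the behavior you want.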

t = SetDict(3)
print t.d  # defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1])})

s = copy.copy(t)
print s.d # defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1])})

t.d[3].add(1)
print t.d # defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1]), 3: set([1])})
print s.d # defaultdict(<type 'set'>, {0: set([1]), 1: set([1]), 2: set([1])})

s.d[0].add(2)
print t.d[0] # set([1, 2])
print s.d[0] # set([1, 2])
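
To illustrate the point that calling __new__ directly skips __init__, here is a minimal sketch. It is written for Python 3 (the code above is Python 2), and the class name Expensive and its init_calls counter are hypothetical, introduced only for this demonstration.

```python
class Expensive(object):
    """Hypothetical class standing in for one with a costly __init__."""
    init_calls = 0  # counts how many times __init__ has run

    def __init__(self, size):
        Expensive.init_calls += 1
        self.data = list(range(size))


a = Expensive(3)                  # normal call: __new__, then __init__
b = Expensive.__new__(Expensive)  # allocation only: __init__ is skipped

print(Expensive.init_calls)       # 1
print(isinstance(b, Expensive))   # True
print(hasattr(b, "data"))         # False
```

Note that b has none of the attributes __init__ would have set, so __copy__ has to assign every attribute the instance needs.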

Answer 2

Another option is to have the __init__ method take a default argument copying=False. If copying is True, it can simply return. This would be something like

class Foo(object):
    def __init__(self, value, copying=False):
        if copying:
            return
        self.value = value

    def __copy__(self):
        other = Foo(0, copying=True)
        other.value = self.value
        return other

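I don't like this, because when you make a copy you have to pass dummy arguments to the __init__ method, and I prefer an __init__ method whose sole purpose is to initialize an instance rather than to decide whether or not an instance should be initialized.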
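
For comparison, here is a sketch of the same Foo using the __new__ technique from Answer 1 instead of a flag. This swaps in a different approach rather than showing what this answer proposes. It is written for Python 3 and assumes that value is the only state that needs to be carried over.

```python
import copy


class Foo(object):
    def __init__(self, value):
        # No copying flag: __init__ only initializes.
        self.value = value

    def __copy__(self):
        # Allocate an instance without running __init__,
        # then fill in its state by hand.
        other = type(self).__new__(type(self))
        other.value = self.value
        return other


f = Foo([1, 2])
g = copy.copy(f)

print(g is f)              # False
print(g.value is f.value)  # True
```

No dummy argument is passed anywhere, which avoids the drawback described above.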

Answer 3

Based on aaronsterling's solution, I wrote the following, which I believe is more flexible if the instance has other attributes as well:

class CopiableSetDict(SetDict):
    def __copy__(self):
        # Create an uninitialized instance
        other = self.__class__.__new__(self.__class__)
        # Give it the same attributes (references)
        other.__dict__ = self.__dict__.copy()
        # Create a copy of d dict so other can have its own
        other.d = self.d.copy()
        return other
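
The following is a self-contained Python 3 sketch of how this __copy__ could be exercised. The label attribute and the extra __init__ are hypothetical additions used only to show that other attributes carry over, and the sketch assumes the class stores its state in __dict__ (that is, it does not use __slots__).

```python
import collections
import copy


class SetDict(object):
    def __init__(self, size):
        self.d = collections.defaultdict(set)
        for i in range(size):
            self.d[i].add(1)


class CopiableSetDict(SetDict):
    def __init__(self, size, label="demo"):
        super().__init__(size)
        self.label = label  # hypothetical extra attribute

    def __copy__(self):
        cls = self.__class__
        other = cls.__new__(cls)               # skips __init__
        other.__dict__ = self.__dict__.copy()  # share all attributes by reference
        other.d = self.d.copy()                # give the copy its own outer dict
        return other


s = CopiableSetDict(3)
t = copy.copy(s)

t.d[3] = {2}    # new key only in t.d
t.d[0].add(2)   # the inner set is shared, so s sees this change

print(type(t).__name__)  # CopiableSetDict
print(t.label)           # demo
print(3 in s.d)          # False
print(s.d[0])            # {1, 2}
print(t.d[5])            # set()
```

The last line works because defaultdict.copy() returns a defaultdict with the same default_factory, so the copied d still creates empty sets for missing keys.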

Limited deep copy of an instance with a container of containers as an attribute
