I have this code:
import resource
from copy import deepcopy
import pandas as pd
import random

n = 100000
df = pd.DataFrame({'x': [random.randint(1, 1000) for i in range(n)],
                   'y': [random.randint(1, 1000) for i in range(n)],
                   'z': [random.randint(1, 1000) for i in range(n)]
                   })
df2 = deepcopy(df)
print('memory peak: {kb} kb'.format(kb=resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))
I expected this code to report different peak memory usage with and without the df2 = deepcopy(df) line, but the reported memory is exactly the same. Shouldn't deepcopy clone the object and therefore increase memory usage?
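One way to observe the allocation more directly is Python's tracemalloc module instead of ru_maxrss. This is a minimal sketch, not part of the original question, and it assumes NumPy reports its buffer allocations to tracemalloc (recent NumPy versions do):

```python
import tracemalloc
from copy import deepcopy
import pandas as pd
import numpy as np

n = 100_000
# Build the frame with a NumPy array so the data is a numeric block.
df = pd.DataFrame({'x': np.random.randint(1, 1000, n)})

tracemalloc.start()
df2 = deepcopy(df)                                  # the operation under test
current, peak = tracemalloc.get_traced_memory()     # bytes allocated since start()
tracemalloc.stop()

print('peak traced allocations:', peak, 'bytes')
```

Unlike ru_maxrss, which reports the process-wide high-water mark (and so may not change if the allocator reuses already-mapped memory), tracemalloc counts the Python-level allocations made between start() and the measurement.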
Answer 0 (score: 3)
When copy.deepcopy is called on an object foo, Python looks for a __deepcopy__ method on it, which is then invoked. For DataFrame instances, which extend the NDFrame class, the __deepcopy__ method delegates the operation to NDFrame.copy. Here is the implementation of NDFrame.__deepcopy__:
def __deepcopy__(self, memo=None):
    """
    Parameters
    ----------
    memo, default None
        Standard signature. Unused
    """
    if memo is None:
        memo = {}
    return self.copy(deep=True)
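Because __deepcopy__ delegates to copy(deep=True), calling copy.deepcopy on a DataFrame yields an independent copy of the numeric data, just as copy(deep=True) would. A quick sketch (not from the answer) to confirm the copies are independent:

```python
import pandas as pd
from copy import deepcopy

df = pd.DataFrame({'x': [1, 2, 3]})
df2 = deepcopy(df)        # internally calls df.copy(deep=True)

df2.loc[0, 'x'] = 99      # mutate the copy only
print(df.loc[0, 'x'])     # prints 1: the original is unaffected
```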
You can read the docstring of NDFrame.copy here, but this is the main excerpt:
def copy(self, deep=True):
    """
    Make a copy of this object's indices and data.

    When ``deep=True`` (default), a new object will be created with a
    copy of the calling object's data and indices. Modifications to
    the data or indices of the copy will not be reflected in the
    original object (see notes below).

    (...)

    Notes
    -----
    When ``deep=True``, data is copied but actual Python objects
    will not be copied recursively, only the reference to the object.
    This is in contrast to `copy.deepcopy` in the Standard Library,
    which recursively copies object data (see examples below).
In your example, the data are Python lists, so they are not actually copied.
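The Notes above can be demonstrated with a column of object dtype: copy(deep=True) copies the array of references, but not the Python objects those references point to. A minimal sketch (my own example, not from the answer):

```python
import pandas as pd

# An object-dtype column holding actual Python objects (lists).
df = pd.DataFrame({'a': [[1, 2], [3, 4]]})
df2 = df.copy(deep=True)

# The array of references was copied, but the inner lists were not,
# so mutating a list through the copy is visible in the original:
df2.loc[0, 'a'].append(99)
print(df.loc[0, 'a'])   # prints [1, 2, 99]
```

Since DataFrame.__deepcopy__ also routes through copy(deep=True), even copy.deepcopy shares these inner objects, which is why the extra memory you expected never appears.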