I wrote the following code:
from hurry.filesize import size
from pysize import get_size
import os
import psutil

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    objects = make_a_call()
    print "total size of objects is " + size(get_size(objects))
    print "process consumes " + size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)

main()
I use this code to report the memory consumed by the objects. Here is the output I get:
process consumes 21M
start method
total size of objects is 20M
process consumes 29M
exit method
process consumes 29M
Answer 0 (score: 3)
Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether; it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.

CPython implementation detail: CPython currently uses a reference-counting scheme with (optional) delayed detection of cyclically linked garbage, which collects most objects as soon as they become unreachable, but is not guaranteed to collect garbage containing circular references. See the documentation of the gc module for information on controlling the collection of cyclic garbage. Other implementations act differently and CPython may change. Do not depend on immediate finalization of objects when they become unreachable (so you should always close files explicitly).
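As a minimal Python 3 sketch of that last point (the `Node` class is illustrative, not from the original post), reference counting alone never frees a reference cycle, but the gc module's collector finds it:

```python
import gc

class Node(object):
    """Illustrative node type that can participate in a reference cycle."""
    def __init__(self):
        self.ref = None

def make_cycle():
    # a and b reference each other, so their refcounts never drop to zero
    a, b = Node(), Node()
    a.ref = b
    b.ref = a

gc.collect()              # clear any pre-existing garbage first
make_cycle()              # the cycle is now unreachable but still in memory
collected = gc.collect()  # cycle detector finds the unreachable objects
print(collected)          # number of unreachable objects found (at least 2)
```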
Answer 1 (score: 3)
Here is a fully working (Python 2.7) example that exhibits the same problem (I updated the original code slightly for simplicity):
from hurry.filesize import size
from pysize import get_size
import os
import psutil

def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    process = psutil.Process(os.getpid())
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    # FIXME
    print "total size of objects is ", size(get_size(objects))
    print "process consumes ", size(process.memory_info().rss)
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)

main()
Here is the output:
process consumes 7M
start method
process consumes 7M
total size of objects is 30M
process consumes 124M
exit method
process consumes 124M
The difference is ~100MB.
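To see where the 30M figure for the objects themselves comes from, here is a rough Python 3 estimate (the answer's code is Python 2, where range() returns a list; in Python 3 we build the list explicitly). It sums sys.getsizeof over the list and each of its elements, which is essentially what a deep sizer measures here:

```python
import sys

# A million-element list, mirroring range(1000000) in the Python 2 example
objects = list(range(1000000))

# Shallow size of the list (its array of pointers) plus each int object
total = sys.getsizeof(objects) + sum(sys.getsizeof(i) for i in objects)
print(total / 1024.0 / 1024.0)  # roughly 30-40 MB, in line with the reported 30M
```

The exact number varies by interpreter version and platform, but the order of magnitude matches the output above.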
Here is the fixed version of the code:
from hurry.filesize import size
from pysize import get_size
import os
import psutil

def make_a_call():
    return range(1000000)

def load_objects():
    process = psutil.Process(os.getpid())
    print "start method"
    process = psutil.Process(os.getpid())
    print "process consumes ", size(process.memory_info().rss)
    objects = make_a_call()
    print "process consumes ", size(process.memory_info().rss)
    print "total size of objects is ", size(get_size(objects))
    print "exit method"

def main():
    process = psutil.Process(os.getpid())
    print "process consumes " + size(process.memory_info().rss)
    load_objects()
    print "process consumes " + size(process.memory_info().rss)

main()
Here is the updated output:
process consumes 7M
start method
process consumes 7M
process consumes 38M
total size of objects is 30M
exit method
process consumes 124M
Did you spot the difference? You compute the size of the objects before measuring the final process size, and that computation is itself what causes the extra memory consumption. Let's look at why this happens. Here is the source of pysize (https://github.com/bosswissam/pysize/blob/master/pysize.py):
import sys
import inspect

def get_size(obj, seen=None):
    """Recursively finds size of objects in bytes"""
    size = sys.getsizeof(obj)
    if seen is None:
        seen = set()
    obj_id = id(obj)
    if obj_id in seen:
        return 0
    # Important mark as seen *before* entering recursion to gracefully handle
    # self-referential objects
    seen.add(obj_id)
    if hasattr(obj, '__dict__'):
        for cls in obj.__class__.__mro__:
            if '__dict__' in cls.__dict__:
                d = cls.__dict__['__dict__']
                if inspect.isgetsetdescriptor(d) or inspect.ismemberdescriptor(d):
                    size += get_size(obj.__dict__, seen)
                break
    if isinstance(obj, dict):
        size += sum((get_size(v, seen) for v in obj.values()))
        size += sum((get_size(k, seen) for k in obj.keys()))
    elif hasattr(obj, '__iter__') and not isinstance(obj, (str, bytes, bytearray)):
        size += sum((get_size(i, seen) for i in obj))
    return size
There is a lot going on here! Most notably, it records the id of every object it has seen in a set in order to handle circular references. If you remove that bookkeeping, neither version will use nearly as much memory.
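As a minimal Python 3 sketch of that bookkeeping cost (a simplified sizer written for illustration, not pysize itself), the seen set gains one entry per visited object, so for a large list it grows to roughly the length of the list and consumes memory of its own while get_size runs:

```python
import sys

def sized_with_seen(obj, seen=None):
    """Simplified recursive sizer that also exposes the 'seen' id set."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0, seen
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, (list, tuple)):
        for item in obj:
            s, _ = sized_with_seen(item, seen)
            size += s
    return size, seen

objects = list(range(1000))
size, seen = sized_with_seen(objects)
print(len(seen))  # one id per element plus the list itself: 1001
```

Scale that to the million-element list in the answer and the set holds about a million ids, which is where a good chunk of the temporary memory goes.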
Here is a good article on the subject; to quote:
If you create a large object and delete it again, Python has probably released the memory, but the memory allocators involved don't necessarily return the memory to the operating system, so it may look as if the Python process uses a lot more virtual memory than it actually uses.