I made a prior post about a problem I was having freeing the memory used by some variables in Python. After trying a lot of things, I finally ran a memory profiler over the code to see where the memory was being allocated.
Here is the memory profiler output:
Line # Mem usage Increment Line Contents
================================================
30 28.523 MiB 0.000 MiB @profile
31 def store_file(file_name,data):
32 29.027 MiB 0.504 MiB with h5py.File(file_name,'w',driver='core',backing_store=False) as f:
33 28.520 MiB -0.508 MiB dset = f.create_dataset("mydataset", (100,) ,dtype='i')
Filename: array.py
Line # Mem usage Increment Line Contents
================================================
35 24.656 MiB 0.000 MiB @profile
36 def load_data_overlap(saved):
37 #os.chdir(numpy_train)
38 24.656 MiB 0.000 MiB print "Inside function!..."
39 24.656 MiB 0.000 MiB if saved == False:
40 25.531 MiB 0.875 MiB train_files = np.random.randint(255,size=(1,40,690,4))
41 25.531 MiB 0.000 MiB train_input_data_interweawed_normalized = []
42 #print "Storing train pic to numpy"
43 25.531 MiB 0.000 MiB part = 0
44 28.523 MiB 2.992 MiB for i in xrange(10):
45 28.523 MiB 0.000 MiB print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
46 28.523 MiB 0.000 MiB if resource.getrusage(resource.RUSAGE_SELF).ru_maxrss > 27033600:
47 28.523 MiB 0.000 MiB print "Max ram storing part: " + str(part) + " At entry: " + str(i)
48 #print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
49 #print "Storing Train input"
50 28.523 MiB 0.000 MiB file_name = 'train_input_'+'part_'+str(part)+'_'+str(dim)+'_'+str(total_frames_with_deltas)+'_window_height_'+str(window_height)+'.h5'
51 27.492 MiB -1.031 MiB store_file(file_name,train_input_data_interweawed_normalized)
52 #print getsizeof(train_input_data_interweawed_normalized)
53 #print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
54 27.492 MiB 0.000 MiB part = part + 1
55 27.492 MiB 0.000 MiB del train_input_data_interweawed_normalized
56 27.492 MiB 0.000 MiB gc.collect()
57 #print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
58 27.492 MiB 0.000 MiB train_input_data_interweawed_normalized = []
59 #print getsizeof(train_input_data_interweawed_normalized)
60 #print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
61 #raw_input("something")
62 28.523 MiB 1.031 MiB for plot in train_files:
63 28.523 MiB 0.000 MiB overlaps_reshaped = np.random.randint(10,size=(45,200,5,3))
64 28.523 MiB 0.000 MiB for ind_plot in overlaps_reshaped.reshape(overlaps_reshaped.shape[1],overlaps_reshaped.shape[0],overlaps_reshaped.shape[2],overlaps_reshaped.shape[3]):
65 28.523 MiB 0.000 MiB ind_plot_reshaped = ind_plot.reshape(ind_plot.shape[0],1,ind_plot.shape[1],ind_plot.shape[2])
66 28.523 MiB 0.000 MiB train_input_data_interweawed_normalized.append(ind_plot_reshaped)
67 28.523 MiB 0.000 MiB print len(train_input_data_interweawed_normalized)
The memory is allocated into variables, and those variables are always cleared afterwards, but the RAM usage is never released...
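One thing worth noting about the prints in the loop above: ru_maxrss is a high-water mark. It reports the *peak* resident set size the process has ever reached, so by definition it never goes down, even after memory is freed. A minimal sketch of that behavior (Unix-only, since it uses the resource module; the 50 MiB size is arbitrary):

```python
import resource

def peak_rss():
    # ru_maxrss is the *peak* resident set size so far
    # (kilobytes on Linux, bytes on macOS)
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
buf = bytearray(50 * 1024 * 1024)  # allocate ~50 MiB
after_alloc = peak_rss()
del buf                            # free it again
after_free = peak_rss()

# The counter is monotonic: freeing memory never lowers it
assert after_alloc >= before
assert after_free >= after_alloc
```

So a flat or rising ru_maxrss on its own does not prove that the memory is still in use.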
I then tried to visualize the memory usage over time and saw this:
The memory usage never seems to go back down. It looks as if the Python code dynamically allocates whatever memory it needs and then never returns it, as if the code "assumes" it will need that much memory again because the variables used that much before. So something like a list will grow in size but never shrink...

Is that a correct understanding? Is this how Python manages memory?
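One way to check whether the objects themselves are actually being freed, independently of what the OS-level numbers say, is tracemalloc, which tracks memory allocated through Python's own allocator. A minimal sketch (Python 3; the sizes are arbitrary and unrelated to the code above):

```python
import tracemalloc

tracemalloc.start()

data = [bytearray(1024) for _ in range(10_000)]  # roughly 10 MiB of objects
used_alloc, _ = tracemalloc.get_traced_memory()

del data
used_freed, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Python-level memory drops as soon as the list is deleted,
# even if the process's resident set size stays high.
assert used_freed < used_alloc / 10
```

If tracemalloc shows the memory being freed while the process RSS stays flat, the memory is being returned to Python's allocator and reused within the process, just not handed back to the operating system.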
I have tried googling the topic, but most posts just suggest calling gc.collect(), which seems to imply that releasing the memory should be possible?..
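For what it's worth, gc.collect() only matters for reference cycles; ordinary acyclic objects are freed immediately by reference counting as soon as the last reference disappears. A minimal sketch of the difference (Python 3; Node is a throwaway class for illustration):

```python
import gc
import weakref

class Node:
    pass

# Acyclic object: reference counting frees it the moment
# the last reference goes away, no gc.collect() needed.
a = Node()
probe = weakref.ref(a)
del a
assert probe() is None

# Object in a reference cycle: reference counting alone cannot
# reclaim it, only the cyclic garbage collector can.
b = Node()
b.self = b           # cycle: b references itself
probe2 = weakref.ref(b)
del b
gc.collect()
assert probe2() is None
```

So in code like the loop above, where the list holds no cycles, del plus gc.collect() already frees the objects at the Python level; whether the allocator then returns those pages to the OS is a separate question.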