Releasing memory used by a variable in Python

Date: 2017-04-11 14:27:03

Tags: python macos list numpy memory

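I am currently trying to store some data in an .h5 file, and I quickly realized that I will probably have to store it in parts, because I cannot hold all of it in RAM. I started out using numpy.array to reduce memory usage, but formatting the data that way ended up taking days.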

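So I went back to using a list, but made the program monitor its memory usage. When usage rises above a specified value, the current part is stored in numpy format so that another process can load and use it. The problem is that what I thought would free the memory does not free it. For some reason the memory stays the same, even though I del the variable and reset it. Why is the memory not being released here?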

import numpy as np
import os
import resource
import sys
import gc
import math
import h5py
import SecureString
import objgraph
from numpy.lib.stride_tricks import as_strided as ast

total_frames = 15
total_frames_with_deltas = total_frames*3
dim = 40
window_height = 5


def store_file(file_name,data):
    with h5py.File(file_name,'w') as f:
        f["train_input"] = np.concatenate(data,axis=1)

def load_data_overlap(saved):
    #os.chdir(numpy_train)
    print "Inside function!..."
    if saved == False:
        train_files = np.random.randint(255,size=(1,40,690,4))
        train_input_data_interweawed_normalized = []
        print "Storing train pic to numpy"
        part = 0
        for i in xrange(100000):
            print resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            if resource.getrusage(resource.RUSAGE_SELF).ru_maxrss > 2298842112/10:
                print "Max ram storing part: " + str(part) + " At entry: " + str(i)
                print "Storing Train input"
                file_name = 'train_input_'+'part_'+str(part)+'_'+str(dim)+'_'+str(total_frames_with_deltas)+'_window_height_'+str(window_height)+'.h5'
                store_file(file_name,train_input_data_interweawed_normalized)
                part = part + 1             
                del train_input_data_interweawed_normalized
                gc.collect()
                train_input_data_interweawed_normalized = []
                raw_input("something")
            for plot in train_files:
                overlaps_reshaped = np.random.randint(10,size=(45,200,5,3))
                for ind_plot in overlaps_reshaped.reshape(overlaps_reshaped.shape[1],overlaps_reshaped.shape[0],overlaps_reshaped.shape[2],overlaps_reshaped.shape[3]): 
                    ind_plot_reshaped = ind_plot.reshape(ind_plot.shape[0],1,ind_plot.shape[1],ind_plot.shape[2])
                    train_input_data_interweawed_normalized.append(ind_plot_reshaped)
    print len(train_input_data_interweawed_normalized)

    return train_input_data_interweawed_normalized
#------------------------------------------------------------------------------------------------------------------------------------------------------------

saved = False
train_input = load_data_overlap(saved)

Output:

.....
223662080
224772096
225882112
226996224
228106240
229216256
230326272
Max ram storing part: 0 At entry: 135
Storing Train input
something
377118720
Max ram storing part: 1 At entry: 136
Storing Train input
something
377118720
Max ram storing part: 2 At entry: 137
Storing Train input
something
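
A note on reading the numbers above: the value being printed is ru_maxrss, which getrusage reports as the maximum resident set size, that is, the peak so far rather than the current usage. A peak does not go down when memory is freed, so on its own it cannot show whether del and gc.collect() released anything. The units are also platform dependent (bytes on macOS, kilobytes on Linux). The sketch below reproduces the pattern in isolation. It is written for Python 3 and uses only the standard library; the 50 MB size and the helper name peak_rss are arbitrary choices for illustration, not part of the original post.

```python
import gc
import resource


def peak_rss():
    """Return ru_maxrss, the peak resident set size of this process so far.

    Units depend on the platform: bytes on macOS, kilobytes on Linux.
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


baseline = peak_rss()

# Allocate about 50 MB of non-zero bytes so the pages are actually touched.
block = b"x" * (50 * 1024 * 1024)
peak_with_block = peak_rss()

# Drop the only reference, then run a collection, as in the question.
del block
gc.collect()
peak_after_free = peak_rss()

# The last value is never smaller than the middle one: it is a high-water mark.
print(baseline, peak_with_block, peak_after_free)
```

To watch current rather than peak usage, a different measurement is needed. One option is the third-party psutil package, where psutil.Process().memory_info().rss returns the current resident set size; that is a suggestion, not something the original code uses.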

1 answer:

Answer 0 (score: 0)

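You need to force garbage collection explicitly; see here:

  According to the official Python documentation, you can force the garbage collector to release unreferenced memory with gc.collect().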
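
As a minimal sketch of what gc.collect() does, assuming CPython and Python 3: objects that only reference each other in a cycle are not reclaimed by reference counting alone, and gc.collect() finds them and returns how many unreachable objects it found. The Node class and make_cycle function are hypothetical names used only for this illustration.

```python
import gc


class Node(object):
    """Tiny illustrative object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that reference each other, then drop both names."""
    a, b = Node(), Node()
    a.other, b.other = b, a  # a <-> b form a reference cycle
    # On return, a and b go out of scope, but the cycle keeps both alive.


gc.disable()   # keep the automatic collector from running mid-example
gc.collect()   # start from a clean state
make_cycle()
unreachable = gc.collect()  # number of unreachable objects found
gc.enable()

print(unreachable)
```

Note that in CPython a plain list of arrays with no cycles is already freed by reference counting once the last reference is dropped, so an explicit gc.collect() mainly matters when reference cycles are involved.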