Memory usage far higher than it should be

Time: 2019-05-01 13:38:56

Tags: python pandas memory deep-learning pytorch

I am using a simple method to extract descriptors from images and save them to a .csv file on disk. I have about 1 million images, and my network returns 512 features per image (float32).

Therefore, I estimate that by the end of the loop I will have 1e6 * 512 * 4 bytes / 1e9 ≈ 2.0 GB (float32 is 4 bytes per value). However, I observe it using more than twice that much memory.
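To double-check that arithmetic (a minimal sketch; the 1e6 x 512 shape is taken from the description above):

import numpy as np

# Back-of-the-envelope check: float32 is 4 bytes per value.
n_images, n_features = 1_000_000, 512
expected_bytes = n_images * n_features * np.dtype(np.float32).itemsize
print(expected_bytes / 1e9)  # -> 2.048, i.e. about 2 GB of raw payload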

index is a string and class_id is an int64, so I don't think those are the culprits here.

I have already tried gc.collect(), without success. Do you think my code is holding on to references?
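One way to see how much the containers themselves add (a rough standard-library sketch; the small feat_list below is a hypothetical stand-in for the real one, holding one (512,) float32 row per image):

import sys
import numpy as np

# Hypothetical miniature of feat_list: 1000 rows instead of 1e6.
feat_list = [np.random.random(512).astype(np.float32) for _ in range(1000)]

payload = sum(a.nbytes for a in feat_list)  # raw float data only
# getsizeof on an ndarray that owns its data includes the buffer plus a
# ~100-byte object header; the list adds one pointer slot per row on top.
held = sys.getsizeof(feat_list) + sum(sys.getsizeof(a) for a in feat_list)
print(payload, held)  # held > payload: per-object overhead scales with rows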

Here is the method:

def prepare_gallery(self, data_loader, TTA, pbar=False, dump_path=None):
    '''Compute embeddings for a data_loader and store them in the model.
    This is required before predicting on a test set.
    New entries should be removed from the data before calling this function
    to avoid inferring on useless images.
    data_loader: a linear loader containing the database that the test set
    is compared against.'''
    self.set_mode('valid')
    self.net.cuda()
    # len(data_loader) is already the number of batches.
    n_iter = len(data_loader)
    if pbar:
        loader = tqdm(enumerate(data_loader), total=n_iter)
    else:
        loader = enumerate(data_loader)
    # Run inference and collect embeddings
    feat_list = []
    index_list = []
    class_list = []
    for i, (index, im, class_id) in loader:
        with torch.no_grad():
            feat = tta(self.net, im)
            # Returns something like np.random.random((32, 512))

        feat_list.extend(feat)
        index_list.extend(index)
        # .item() only works on one-element tensors; .tolist() handles a batch
        class_list.extend(class_id.tolist())

    if dump_path is not None:
        np.save(dump_path + '_ids', index_list)
        np.save(dump_path + '_cls', class_list)
        np.save(dump_path + '_feat', feat_list)

    return np.asarray(index_list), np.asarray(feat_list), np.asarray(class_list)
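One detail the estimate above does not account for: np.asarray(feat_list) builds a second full copy of the features while the list is still alive, so peak memory can approach double the payload. A minimal sketch of the same loop with a preallocated array (net and the (batch, 512) output shape are assumptions based on the question; net(im.cuda()) stands in for tta(self.net, im)):

import numpy as np
import torch

def collect_features(net, data_loader, n_features=512):
    '''Preallocated variant of the gallery loop: writes each batch of
    embeddings straight into one float32 array instead of extending a
    Python list, avoiding both per-row overhead and the final copy.'''
    n_images = len(data_loader.dataset)
    feats = np.empty((n_images, n_features), dtype=np.float32)
    row = 0
    for index, im, class_id in data_loader:
        with torch.no_grad():
            out = net(im.cuda()).cpu().numpy()  # assumed (batch, 512) output
        feats[row:row + len(out)] = out
        row += len(out)
    return feats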

0 Answers