Question

TL;DR: How can I supply more data to a Theano function without taking up more memory?

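The problem I'm having is that training my ML algorithm on the GPU with Theano eventually causes the GPU to run out of memory. I deviated slightly from the tutorial because my dataset is too large to read entirely into memory (this must be an issue for video algorithms too, right?), so instead of using the index-based inputs and updates scheme, I pass ndarrays to the Theano function directly.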

Let me give an example of what I mean. The Logistic Regression tutorial for Theano says to do something along these lines:

train_model = theano.function(
    inputs=[index],
    outputs=cost,
    updates=updates,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]
    }
)

This requires train_set_x and train_set_y to be loaded into memory, and the tutorial uses a SharedVariable to store the full dataset.

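In my case the dataset is huge (many gigabytes), which means it cannot all be loaded into memory at once, so I modified my function to take the data directly, like this: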

train_model = theano.function(
    inputs=[input, classes], 
    outputs=cost, 
    updates=updates
)

Then I do something that looks roughly like this:

for count, data in enumerate(extractor):
    observations, labels = data
    batch_cost = train_model(observations, labels)
    logger.debug("Generation %d: %f cost", count, batch_cost)

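I think I may be fundamentally misunderstanding how to pass data to the GPU properly without running into some nasty Python garbage-collection mess. It looks as though the model just takes up more and more memory internally, because after training on a (large) number of batches I get an error like this: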

Error when tring to find the memory information on the GPU: initialization error
Error freeing device pointer 0x500c88000 (initialization error). Driver report 0 bytes free and 0 bytes total 
CudaNdarray_uninit: error freeing self->devdata. (self=0x10cbbd170, self->devata=0x500c88000)
Exception MemoryError: 'error freeing device pointer 0x500c88000 (initialization error)' in 'garbage collection' ignored
Fatal Python error: unexpected exception during garbage collection

How can I supply more data to a Theano function without taking up more memory?

Answer 1

If the dataset does not fit in memory, the idea is to take portions of the data and load each one when it is needed.

If your data does not fit in GPU memory, you can iterate over portions of the dataset, called minibatches, as shown in the classic Lasagne tutorial.
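As a rough illustration of minibatch iteration, here is a minimal sketch in plain Python. The function name `iterate_minibatches`, its parameters, and the toy data are assumptions made for this example and are not taken from the question's code; the slicing works the same way on NumPy arrays.

```python
def iterate_minibatches(inputs, targets, batch_size):
    """Yield successive (inputs, targets) slices of at most batch_size items."""
    if len(inputs) != len(targets):
        raise ValueError("inputs and targets must have the same length")
    for start in range(0, len(inputs), batch_size):
        end = start + batch_size
        # The last slice may be shorter than batch_size.
        yield inputs[start:end], targets[start:end]


# Toy data standing in for a real dataset (hypothetical values).
observations = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [0.2, 0.8], [0.9, 0.1]]
labels = [1, 0, 1, 1, 0]

batches = list(iterate_minibatches(observations, labels, batch_size=2))
```

Each yielded pair could then be passed to a compiled function such as the question's train_model(observations, labels), so only one minibatch needs to be handed to the GPU at a time.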

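Then, if your data does not fit in RAM, you need to load each minibatch when it is needed. The best way to do this is to have a separate process load the next minibatch (CPU work) while the current one is being processed (GPU work).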
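The overlap of loading and training can be sketched with a bounded queue. Note that this sketch swaps in a background thread from the standard library instead of a separate process, to keep it short; whether a thread is sufficient depends on how much of the loading is I/O rather than Python computation. The names `prefetch`, `max_prefetch`, and `load_batches` are hypothetical.

```python
import queue
import threading


def prefetch(batch_iterable, max_prefetch=2):
    """Yield items from batch_iterable while a background thread loads ahead."""
    # The bounded queue caps how many batches are held in memory at once.
    buffer = queue.Queue(maxsize=max_prefetch)

    def producer():
        try:
            for batch in batch_iterable:
                buffer.put(("item", batch))  # blocks while the queue is full
            buffer.put(("done", None))
        except Exception as exc:
            # Forward loader errors to the consumer instead of losing them.
            buffer.put(("error", exc))

    # Daemon thread: it will not keep the program alive if the consumer stops early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        kind, payload = buffer.get()
        if kind == "done":
            return
        if kind == "error":
            raise payload
        yield payload


def load_batches(n_batches):
    """Hypothetical loader standing in for reading minibatches from disk."""
    for i in range(n_batches):
        yield [i, i + 1], [i % 2]


seen = []
for observations, labels in prefetch(load_batches(4)):
    # In the question's loop, train_model(observations, labels) would run here.
    seen.append((observations, labels))
```

Because the queue is bounded, at most max_prefetch batches wait in RAM while the current one is being trained on, so memory use stays flat regardless of dataset size.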

You can take inspiration from AlexNet.
