Question

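At every step of a while_loop I want to update a 0.5 GB variable. I cannot avoid the loop, because each iteration depends on the previous one. My program needs to run the while loop 100 million times.

To measure how tf.while_loop performs in this situation, I wrote a test. The update here simply adds a constant to the variable.

However, even this simple loop takes 24 seconds and needs 4 × 1 GB of memory. I suspect the loop keeps trying to reallocate a 1 GB memory block, which is very slow on the GPU. The GPU has 4 GB of memory, and when I set the variable to 2 GB I get an OOM error.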

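Is it possible to avoid the reallocation?

I could use x as a loop variable instead of using tf.control_dependencies, but that takes up even more memory.

Using tf.contrib.compiler.jit.experimental_jit_scope leads to an OOM error.

Thanks.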

Test:

import tensorflow as tf
import numpy as np
from functools import partial
from timeit import default_timer as timer

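def body1(x, i):
    # x + 0.001 builds a new tensor, which is then assigned back to x.
    a = tf.assign(x, x + 0.001)
    # Force the assignment to run before the loop counter advances.
    with tf.control_dependencies([a]):
        return i + 1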


def make_loop1(x, end_ix):
    i = tf.Variable(0, name="i", dtype=np.int32)
    cond = lambda i2: tf.less(i2, end_ix)
    body = partial(body1, x)
    return tf.while_loop(
        cond, body, [i], back_prop=False,
        parallel_iterations=1)


def main():
    N = int(1e9 / 4)
    x = tf.get_variable('x', shape=N, dtype=np.float32,
                        initializer=tf.ones_initializer)

    end_ix = tf.constant(int(1000), dtype=np.int32)
    loop1 = make_loop1(x, end_ix)

    init_op = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init_op)
        print("running_loop1")
        st = timer()
        sess.run(loop1)
        en = timer()
        print(en - st)  # elapsed wall-clock seconds
        print(sess.run(x[0]))

main()
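
As a rough sanity check of the figures in this question, the arithmetic can be written out in plain Python (no TensorFlow needed). This is only a sketch under stated assumptions: that the 24 seconds refers to the 1000 iterations of the test, that the per-iteration cost stays constant, and that `x + 0.001` is materialized as a separate buffer before the assignment. The variable names below are illustrative and not part of the test.

```python
# Back-of-the-envelope figures for the test above.
BYTES_PER_FLOAT32 = 4

n_elements = int(1e9 / 4)                    # N in main()
var_bytes = n_elements * BYTES_PER_FLOAT32   # size of the variable x

# Assumption: x + 0.001 lives in its own buffer until tf.assign copies it
# into x, so at least two buffers of this size exist at the same time.
min_peak_bytes = 2 * var_bytes

test_iterations = 1000        # end_ix in main()
measured_seconds = 24.0       # runtime reported in the question
seconds_per_iteration = measured_seconds / test_iterations

target_iterations = 100_000_000   # "100 million times"
projected_days = seconds_per_iteration * target_iterations / 86400

print("variable size (GB):", var_bytes / 1e9)
print("minimum peak (GB):", min_peak_bytes / 1e9)
print("ms per iteration:", seconds_per_iteration * 1000)
print("projected days:", round(projected_days, 1))
```

Under these assumptions the test variable is 1 GB, each iteration costs about 24 ms, and the full run of 100 million iterations would take roughly four weeks, which is why the per-iteration cost matters so much here.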

Avoiding memory reallocation in a TensorFlow while_loop
