Question

I have a function in a Python script that I call many times (https://github.com/sankhaMukherjee/NNoptExpt/blob/dev/src/lib/NNlib/NNmodel.py). I have significantly simplified the function for this example.

def errorValW(self, X, y, weights):

    errVal = None

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        nW = len(self.allW)
        W = weights[:nW] 
        B = weights[nW:]

        for i in range(len(W)):
            sess.run(tf.assign( self.allW[i], W[i] ))

        for i in range(len(B)):
            sess.run(tf.assign( self.allB[i], B[i] ))

        errVal = sess.run(self.err, 
            feed_dict = {self.Inp: X, self.Op: y})

    return errVal

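I call this function many times from another function. Looking at the program log, each call seems to take longer than the last. Part of the log is shown here: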

21:37:12,634 - ... .errorValW ... - Finished the function [errorValW] in 1.477610e+00 seconds
21:37:14,116 - ... .errorValW ... - Finished the function [errorValW] in 1.481470e+00 seconds
21:37:15,608 - ... .errorValW ... - Finished the function [errorValW] in 1.490914e+00 seconds
21:37:17,113 - ... .errorValW ... - Finished the function [errorValW] in 1.504651e+00 seconds
21:37:18,557 - ... .errorValW ... - Finished the function [errorValW] in 1.443876e+00 seconds
21:37:20,183 - ... .errorValW ... - Finished the function [errorValW] in 1.625608e+00 seconds
21:37:21,719 - ... .errorValW ... - Finished the function [errorValW] in 1.534915e+00 seconds
... many lines later  
22:59:26,524 - ... .errorValW ... - Finished the function [errorValW] in 9.576592e+00 seconds
22:59:35,991 - ... .errorValW ... - Finished the function [errorValW] in 9.466405e+00 seconds
22:59:45,708 - ... .errorValW ... - Finished the function [errorValW] in 9.716456e+00 seconds
22:59:54,991 - ... .errorValW ... - Finished the function [errorValW] in 9.282923e+00 seconds
23:00:04,407 - ... .errorValW ... - Finished the function [errorValW] in 9.415035e+00 seconds

Has anyone else experienced something like this? I find it baffling...

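Edit: this is just for reference...

For reference, the initializer of the class is shown below. I suspect that the graph behind the result variable is progressively growing in size. I have already seen this problem when saving the model with tf.train.Saver(tf.trainable_variables()): the size of the saved file keeps increasing. I am not sure whether I have made a mistake in the way I define the model...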

def __init__(self, inpSize, opSize, layers, activations):

    self.inpSize = inpSize
    self.Inp     = tf.placeholder(dtype=tf.float32, shape=inpSize, name='Inp')
    self.Op      = tf.placeholder(dtype=tf.float32, shape=opSize, name='Op')

    self.allW    = []
    self.allB    = []

    self.result  = None

    prevSize = inpSize[0]
    for i, l in enumerate(layers):
        tempW = tf.Variable( 0.1*(np.random.rand(l, prevSize) - 0.5), dtype=tf.float32, name='W_{}'.format(i) )
        tempB = tf.Variable( 0, dtype=tf.float32, name='B_{}'.format(i) )

        self.allW.append( tempW )
        self.allB.append( tempB )

        if i == 0:
            self.result = tf.matmul( tempW, self.Inp ) + tempB
        else:
            self.result = tf.matmul( tempW, self.result ) + tempB

        prevSize = l

        if activations[i] is not None:
            self.result = activations[i]( self.result )

    self.err = tf.sqrt(tf.reduce_mean((self.Op - self.result)**2))


    return

Answer 1

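You are calling tf.assign inside the session context. Every time the errorValW function executes, this adds more ops to your graph, and execution slows down as the graph grows. As a rule of thumb, avoid creating Tensorflow ops while executing a model on data (this usually happens inside a loop, so the graph grows without bound). In my personal experience, even adding only "a few" ops during execution can cause an extreme slowdown.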

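Note that tf.assign is an op like any other. You should define it once beforehand (when creating the model / building the graph) and then run that same op repeatedly after launching the session.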
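The graph growth described above can be observed directly by counting the operations in the graph before and after a loop that calls tf.assign inside sess.run. This is a minimal sketch, not the asker's code: the variable name some_var and the loop length are illustrative, and importing tensorflow.compat.v1 with tf.disable_v2_behavior() is an assumption made so that the TF1-style graph API is also available on newer installations.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: run in TF1-style graph mode

graph = tf.Graph()
with graph.as_default():
    some_var = tf.Variable(0.0, dtype=tf.float32, name="some_var")
    init_op = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init_op)
        ops_before = len(graph.get_operations())
        for step in range(5):
            # Every tf.assign call builds new nodes in the graph
            # before sess.run executes them.
            sess.run(tf.assign(some_var, float(step)))
        ops_after = len(graph.get_operations())

print("ops before loop:", ops_before)
print("ops after loop: ", ops_after)
```

The second count is larger than the first, and it keeps rising with every extra iteration. Calling graph.finalize() once the model is built is one way to catch this early, since a finalized graph rejects any attempt to add further ops.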

I don't know exactly what you are trying to achieve in your code snippet, but consider the following:

...
with tf.Session() as sess:
    sess.run(tf.assign(some_var, a_value))

could be replaced by

a_placeholder = tf.placeholder(type_for_a_value, shape_for_a_value)
assign_op = tf.assign(some_var, a_placeholder)
...
with tf.Session() as sess:
    sess.run(assign_op, feed_dict={a_placeholder: a_value})

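where a_placeholder should have the same dtype/shape as some_var. I have to admit I haven't tested this snippet, so please let me know if there are problems, but it should be about right.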
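Applied to a model that keeps its variables in lists, as the question's allW and allB do, the same pattern might look like the sketch below. TinyModel is a hypothetical one-layer stand-in for the asker's class, and the attributes feeds, assign_ops and init_op are names introduced here for illustration. As in the previous sketch, tensorflow.compat.v1 with tf.disable_v2_behavior() is assumed.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # assumption: run in TF1-style graph mode


class TinyModel:
    """Hypothetical one-layer stand-in for the class in the question."""

    def __init__(self):
        self.graph = tf.Graph()
        with self.graph.as_default():
            self.Inp = tf.placeholder(tf.float32, shape=(3, 1), name="Inp")
            self.Op = tf.placeholder(tf.float32, shape=(2, 1), name="Op")
            self.allW = [tf.Variable(np.zeros((2, 3)), dtype=tf.float32, name="W_0")]
            self.allB = [tf.Variable(0, dtype=tf.float32, name="B_0")]

            result = tf.matmul(self.allW[0], self.Inp) + self.allB[0]
            self.err = tf.sqrt(tf.reduce_mean((self.Op - result) ** 2))

            # Build one placeholder and one assign op per variable, once.
            variables = self.allW + self.allB
            self.feeds = [tf.placeholder(v.dtype.base_dtype, shape=v.shape)
                          for v in variables]
            self.assign_ops = [tf.assign(v, p)
                               for v, p in zip(variables, self.feeds)]
            self.init_op = tf.global_variables_initializer()

            # Make the graph read-only so stray ops cannot be added later.
            self.graph.finalize()

    def errorValW(self, X, y, weights):
        # Only runs ops that already exist; nothing new is created here.
        with tf.Session(graph=self.graph) as sess:
            sess.run(self.init_op)
            sess.run(self.assign_ops, feed_dict=dict(zip(self.feeds, weights)))
            return sess.run(self.err, feed_dict={self.Inp: X, self.Op: y})


model = TinyModel()
W = np.ones((2, 3), dtype=np.float32)
X = np.ones((3, 1), dtype=np.float32)
y = np.zeros((2, 1), dtype=np.float32)

n_ops = len(model.graph.get_operations())
errors = [model.errorValW(X, y, [W, 0.5]) for _ in range(3)]
print(errors, len(model.graph.get_operations()) == n_ops)
```

The weights list follows the question's ordering: all W values first, then all B values. This sketch still opens a new session on every call, as the original function does. Keeping one session alive across calls would probably save further time, but that goes beyond the change suggested here.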
