How can layers/variables be shared or reused between different graphs in TensorFlow?

Asked: 2017-04-21 11:45:31

Tags: tensorflow

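I am building some complex neural network models in which 2 networks share some layers. My implementation creates 2 TensorFlow graphs and shares layers/variables between them. However, I get an error while the networks are being created.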

import tensorflow as tf
def create_network(self):
    self.state_tensor = tf.placeholder(tf.float64, [None, self.state_size], name="state")
    self.action_tensor = tf.placeholder(tf.float64, [None, self.action_size], name="action")
    self.actor_graph = tf.Graph()
    with self.actor_graph.as_default():
        print tf.get_variable_scope()
        state_h1 = tf.layers.dense(inputs=self.state_tensor, units=64, activation=tf.nn.relu, name="state_h1",
                                   reuse=True)
        state_h2 = tf.layers.dense(inputs=state_h1, units=32, activation=tf.nn.relu, name="state_h2", reuse=True)
        self.policy_tensor = tf.layers.dense(inputs=state_h2, units=self.action_size, activation=tf.nn.softmax,
                                             name="policy")

    self.critic_graph = tf.Graph()
    with self.critic_graph.as_default():
        print tf.get_variable_scope()
        state_h1 = tf.layers.dense(inputs=self.state_tensor, units=64, activation=tf.nn.relu, name="state_h1",
                                   reuse=True)
        state_h2 = tf.layers.dense(inputs=state_h1, units=32, activation=tf.nn.relu, name="state_h2", reuse=True)
        action_h1 = tf.layers.dense(inputs=self.action_tensor, units=64, activation=tf.nn.relu, name="action_h1")
        action_h2 = tf.layers.dense(inputs=action_h1, units=32, activation=tf.nn.relu, name="action_h2")
        fc = tf.layers.dense(inputs=[state_h2, action_h2], units=32, activation=tf.nn.relu,
                             name="fully_connected")
        self.value_tensor = tf.layers.dense(inputs=fc, units=1, activation=None, name="value")

Here is the error message:

Traceback (most recent call last):
<tensorflow.python.ops.variable_scope.VariableScope object at 0x1101c3790>
  File "/Users/niyan/code/routerRL/test.py", line 16, in <module>
    model = DPGModel(state_dim, action_dim)
  File "/Users/niyan/code/routerRL/DPGModel.py", line 10, in __init__
    self.create_network()
  File "/Users/niyan/code/routerRL/DPGModel.py", line 37, in create_network
    reuse=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/layers/core.py", line 216, in dense
    return layer.apply(inputs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 303, in apply
    return self.__call__(inputs, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 269, in __call__
    self.build(input_shapes[0])
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/layers/core.py", line 123, in build
    trainable=True)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 988, in get_variable
    custom_getter=custom_getter)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 890, in get_variable
    custom_getter=custom_getter)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 341, in get_variable
    validate_shape=validate_shape)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 258, in variable_getter
    variable_getter=functools.partial(getter, **kwargs))
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 208, in _add_variable
    trainable=trainable and self.trainable)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 333, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 657, in _get_single_variable
    "VarScope?" % name)
ValueError: Variable state_h1/kernel does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

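So it seems that tensors from one graph cannot be reused in another. However, these really are 2 networks, and they are best treated as separate graphs. Any suggestions?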

1 Answer:

Answer 0: (score: 0)

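A simple way to do what you are asking is to initialize the weight variables from placeholders. You can then share weights between graphs by feeding the weight values into those placeholders. The following example appears to work: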

import tensorflow as tf
import numpy as np

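# The placeholder supplies the variable's initial value at initialization time.
weights_ph = tf.placeholder(tf.float32, [10])
weights = tf.Variable(weights_ph)
loss = -tf.reduce_sum(weights)
train_op = tf.train.GradientDescentOptimizer(0.001).minimize(loss)

with tf.Session() as sess:
    pyweights = np.random.normal(size=[10])
    # The initializer reads weights_ph, so the placeholder must be fed here.
    tf.global_variables_initializer().run(feed_dict={weights_ph: pyweights})

    # Runs until interrupted; each step returns the loss and the current weights.
    while True:
        pyloss, pyweights = sess.run([loss, weights, train_op], feed_dict={weights_ph: pyweights})[:2]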
        print(pyloss)
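
The example above uses a single graph. Below is a minimal sketch of how the same idea might carry over to two separate graphs: the weights trained in one graph are read out as a NumPy array and fed into the initializer of the other. The helper names build_graph and share_weights_between_graphs, the learning rate, and the step count are assumptions made for illustration, and the import assumes the tensorflow.compat.v1 module is available (on TensorFlow 1.x, plain `import tensorflow as tf` should behave the same).

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.x-style API via compat.v1


def build_graph(size, learning_rate=0.1):
    """Build a separate graph whose weights are initialized from a placeholder."""
    graph = tf.Graph()
    with graph.as_default():
        weights_ph = tf.placeholder(tf.float32, [size], name="weights_ph")
        weights = tf.Variable(weights_ph, name="weights")
        loss = -tf.reduce_sum(weights)
        train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
        init = tf.global_variables_initializer()
    return graph, weights_ph, weights, train_op, init


def share_weights_between_graphs(initial, steps=3):
    """Train in graph A, then hand the resulting weights to graph B as a NumPy array."""
    size = len(initial)
    graph_a, ph_a, weights_a, train_a, init_a = build_graph(size)
    graph_b, ph_b, weights_b, _, init_b = build_graph(size)

    with tf.Session(graph=graph_a) as sess_a:
        # The initializer depends on the placeholder, so it is fed here.
        sess_a.run(init_a, feed_dict={ph_a: initial})
        for _ in range(steps):
            sess_a.run(train_a)
        values_a = sess_a.run(weights_a)  # plain NumPy array, independent of any graph

    with tf.Session(graph=graph_b) as sess_b:
        # Graph B starts from whatever graph A ended with.
        sess_b.run(init_b, feed_dict={ph_b: values_a})
        values_b = sess_b.run(weights_b)

    return values_a, values_b
```

The values are copied, not shared: after the hand-off each graph owns its own variable, so the copy has to be repeated whenever the two networks should be brought back in sync.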

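Note that you must feed the initial weights when running tf.global_variables_initializer().run(); the initializer depends on the placeholder, so it raises an error otherwise.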
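
If keeping two separate graphs is not a hard requirement, a different approach from the one in this answer is to build both networks in one graph and share the state layers through variable scopes. The sketch below is only an illustration under several assumptions: dense and build_actor_critic are hypothetical helpers written with tf.get_variable, the inputs are float32, the two hidden branches are joined with tf.concat, and the layer sizes are taken from the question.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.x-style API via compat.v1


def dense(inputs, units, name, reuse=False, activation=tf.nn.relu):
    """Hypothetical helper: a fully connected layer whose variables live in scope `name`."""
    with tf.variable_scope(name, reuse=reuse):
        in_dim = int(inputs.shape[-1])
        kernel = tf.get_variable("kernel", [in_dim, units], dtype=tf.float32)
        bias = tf.get_variable("bias", [units], dtype=tf.float32,
                               initializer=tf.zeros_initializer())
        out = tf.matmul(inputs, kernel) + bias
        return activation(out) if activation is not None else out


def build_actor_critic(state_size, action_size):
    """Build actor and critic in ONE graph so that the state layers can be shared."""
    graph = tf.Graph()
    with graph.as_default():
        state = tf.placeholder(tf.float32, [None, state_size], name="state")
        action = tf.placeholder(tf.float32, [None, action_size], name="action")

        # Actor: the first use of each scope creates the variables.
        h1 = dense(state, 64, "state_h1")
        h2 = dense(h1, 32, "state_h2")
        policy = tf.nn.softmax(dense(h2, action_size, "policy", activation=None))

        # Critic: reuse=True looks up the existing state_h1/state_h2 variables.
        c1 = dense(state, 64, "state_h1", reuse=True)
        c2 = dense(c1, 32, "state_h2", reuse=True)
        a1 = dense(action, 64, "action_h1")
        a2 = dense(a1, 32, "action_h2")
        fc = dense(tf.concat([c2, a2], axis=1), 32, "fully_connected")
        value = dense(fc, 1, "value", activation=None)

        init = tf.global_variables_initializer()
    return graph, state, action, policy, value, init
```

Here reuse=True succeeds because the variables already exist in the same graph by the time the critic is built, which is the condition the error message in the question complains about.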