Question

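I have two pretrained models whose variables W1, b1 and W2, b2 are saved as numpy arrays.

I want to set a mixture of these two pretrained models as the variables of my model, and update only the mixture weights alpha1 and alpha2 during training.

To do that, I create two variables alpha1 and alpha2, load the numpy arrays, and create the mixture nodes: W_new, b_new.

I want to replace W and b in the computation graph with W_new and b_new, and then train only the alpha1 and alpha2 parameters with opt.minimize(loss, var_list=[alpha1, alpha2]).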
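To make the goal concrete, here is a minimal plain-Python sketch of a linear model whose weights are a mixture of two frozen parameter sets, with gradient descent applied to the two mixing coefficients only. The random stand-in parameters, the target mixture (0.3, 0.7), the learning rate, and the hand-derived gradient are all assumptions for illustration; in TensorFlow the optimizer would derive the gradient from the graph.

```python
import random

random.seed(7)
DIM = 10

# Stand-ins for the frozen parameters of the two pretrained models (assumed values).
W1 = [random.random() for _ in range(DIM)]
W2 = [random.random() for _ in range(DIM)]
b1, b2 = random.random(), random.random()


def model_output(x, w, b):
    """Output of a single linear model: x . w + b."""
    return sum(xi * wi for xi, wi in zip(x, w)) + b


def predict(x, alpha1, alpha2):
    """Output of the mixed model with W_new = alpha1*W1 + alpha2*W2, b_new likewise."""
    w_new = [alpha1 * u + alpha2 * v for u, v in zip(W1, W2)]
    b_new = alpha1 * b1 + alpha2 * b2
    return model_output(x, w_new, b_new)


# Toy supervision generated from a known mixture (hypothetical target: 0.3, 0.7).
X = [[random.random() for _ in range(DIM)] for _ in range(50)]
Y = [predict(x, 0.3, 0.7) for x in X]


def loss(alpha1, alpha2):
    """Mean squared error of the mixed model on the toy data."""
    return sum((predict(x, alpha1, alpha2) - y) ** 2 for x, y in zip(X, Y)) / len(X)


alpha1, alpha2, lr = 0.5, 0.5, 0.01
initial_loss = loss(alpha1, alpha2)

for _ in range(2000):
    g1 = g2 = 0.0
    for x, y in zip(X, Y):
        err = predict(x, alpha1, alpha2) - y
        # The prediction is linear in the alphas, so d(pred)/d(alpha_k)
        # is simply the output of pretrained model k on x.
        g1 += 2 * err * model_output(x, W1, b1)
        g2 += 2 * err * model_output(x, W2, b2)
    # Only the mixing coefficients are updated; W1, b1, W2, b2 stay fixed.
    alpha1 -= lr * g1 / len(X)
    alpha2 -= lr * g2 / len(X)

final_loss = loss(alpha1, alpha2)
print(initial_loss, final_loss, alpha1, alpha2)
```

The pretrained arrays never change; the loss depends on the data only through alpha1 and alpha2, which is the property the graph has to preserve.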

I don't know how to substitute W_new and b_new for W and b in the computation graph. I tried assigning tf.trainable_variables()[0] = W_new, but this doesn't work.
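A likely reason the assignment has no effect: tf.trainable_variables() returns a new Python list built from the graph's collection, so overwriting an element changes only that list, and ops that were already built keep the inputs they were created with. The short check below illustrates the first point. The `tensorflow.compat.v1` import and the all-ones array standing in for a pretrained matrix are assumptions so that the TF1-style code also runs on newer installs.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: a build that ships the compat.v1 API

tf.disable_v2_behavior()

W = tf.get_variable('W', shape=(10, 1))
alpha1 = tf.get_variable('alpha1', shape=(1,))
# Hypothetical mixture node; the all-ones array stands in for a pretrained matrix.
W_new = alpha1 * np.ones((10, 1), dtype=np.float32)

snapshot = tf.trainable_variables()
snapshot[0] = W_new  # modifies only this Python list, not the graph

fresh = tf.trainable_variables()  # rebuilt from the graph's collection
fresh_names = [v.op.name for v in fresh]
print(fresh_names)
```

The second call still lists the original variable W, which shows that the graph was not rewired by the list assignment.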

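I would highly appreciate it if anyone could give me some hints.

Note 1: I don't want to assign values to W and b (this would disconnect the graph from alpha1 and alpha2); I want the mixture of parameters to be part of the graph.

Note 2: You might say that I could compute y using the new variables, but the code here is just a toy example to keep things simple. In reality, instead of linear regression I have several bilstms with a crf, so I can't write out the formula by hand. I have to replace these variables in the graph.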

import tensorflow as tf
import numpy as np

np.random.seed(7)
tf.set_random_seed(7)

#define a linear regression model with 10 params and 1 bias
with tf.variable_scope('main'):
    X = tf.placeholder(name='input', dtype=float)
    y_gold = tf.placeholder(name='output', dtype=float)
    W = tf.get_variable('W', shape=(10, 1))
    b = tf.get_variable('b', shape=(1,))
    y = tf.matmul(X, W) + b
    #loss = tf.losses.mean_squared_error(y_gold, y)

#numpy matrices saved from two different trained models with the exact same architecture
W1 = np.random.rand(10, 1)
W2 = np.random.rand(10, 1)
b1 = np.random.rand(1)
b2 = np.random.rand(1)

with tf.variable_scope('mixture'):
    alpha1 = tf.get_variable('alpha1', shape=(1,))
    alpha2 = tf.get_variable('alpha2', shape=(1,))

    W_new = alpha1 * W1 + alpha2 * W2
    b_new = alpha1 * b1 + alpha2 * b2

all_trainable_vars = tf.trainable_variables()
print(all_trainable_vars)

#replace the original W and b with the new mixture variables in the computation graph (**doesn't do what I want**)
all_trainable_vars[0] = W_new
all_trainable_vars[1] = b_new
#this doesn't work

#note that I could just do the computation for y using the new variables as y = tf.matmul(X, W_new) + b_new
#but the problem is this is just a toy example. In real world, my model has a big architecture with several
#bilstms whose variables I want to replace with these new ones.

#Now what I need is to replace W and b trainable parameters (items 0 and 1 in all_trainable vars)
#with W_new and b_new in the computation graph.

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_writer = tf.summary.FileWriter('./' + 'graph', sess.graph)
    #print(sess.run([W, b]))

    #give the model 3 samples and predict on them
    print(sess.run(y, feed_dict={X: np.random.rand(3, 10)}))
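One possible way to get the substitution without rewriting the model is the `custom_getter` argument of `tf.variable_scope`: every `tf.get_variable` call inside the scope is routed through a function that may return any tensor in place of a fresh variable. The sketch below applies that to the toy model. The `pretrained` dictionary, the `mixture_getter` name, the random training data, the learning rate, and the `tensorflow.compat.v1` import are assumptions made for illustration, not part of the original code.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: a build that ships the compat.v1 API

tf.disable_v2_behavior()
np.random.seed(7)

# Stand-ins for the saved arrays of the two pretrained models,
# keyed by the full variable name the model will ask for.
pretrained = {
    'main/W': (np.random.rand(10, 1).astype(np.float32),
               np.random.rand(10, 1).astype(np.float32)),
    'main/b': (np.random.rand(1).astype(np.float32),
               np.random.rand(1).astype(np.float32)),
}

with tf.variable_scope('mixture'):
    alpha1 = tf.get_variable('alpha1', shape=(1,))
    alpha2 = tf.get_variable('alpha2', shape=(1,))


def mixture_getter(getter, name, *args, **kwargs):
    """Return the mixture tensor for known names, else defer to the default getter."""
    if name in pretrained:
        p1, p2 = pretrained[name]
        return alpha1 * p1 + alpha2 * p2
    return getter(name, *args, **kwargs)


# The model code is unchanged; only the scope gains a custom getter.
with tf.variable_scope('main', custom_getter=mixture_getter):
    X = tf.placeholder(tf.float32, shape=(None, 10), name='input')
    y_gold = tf.placeholder(tf.float32, shape=(None, 1), name='output')
    W = tf.get_variable('W', shape=(10, 1))  # actually alpha1*W1 + alpha2*W2
    b = tf.get_variable('b', shape=(1,))     # actually alpha1*b1 + alpha2*b2
    y = tf.matmul(X, W) + b

loss = tf.reduce_mean(tf.square(y - y_gold))
train_op = tf.train.GradientDescentOptimizer(0.01).minimize(
    loss, var_list=[alpha1, alpha2])
trainable_names = [v.op.name for v in tf.trainable_variables()]

# Random toy data, only to check that the alphas receive gradients.
feed = {X: np.random.rand(20, 10), y_gold: np.random.rand(20, 1)}
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    loss_before = sess.run(loss, feed_dict=feed)
    for _ in range(50):
        sess.run(train_op, feed_dict=feed)
    loss_after = sess.run(loss, feed_dict=feed)

print(trainable_names, loss_before, loss_after)
```

Because W and b are never created as variables here, only alpha1 and alpha2 appear in tf.trainable_variables(). Layers that create their weights through tf.get_variable inside the scope should in principle be intercepted the same way, but that is not verified here for tf.contrib.rnn.LSTMCell, and the names the getter receives would have to be mapped to the saved arrays.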

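Why do I want to do this?

Imagine you have several pretrained models (in different domains), but you don't have access to any of their data.

Then you get a little training data from another domain. On its own it doesn't give you much performance, but if you could train the model jointly with the data you don't have, you could get good performance.

Assuming that the data is somehow represented in the trained models, we want to learn a mixture of them by learning the mixing coefficients, using the small amount of labelled data we have as supervision.

We don't want to pretrain any parameters; we only want to learn a mixture of the pretrained models. What are the mixture weights? We need to learn them from the little supervision that we have.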

Update 1:

I realised that I could set the parameters of the model before creating it as:

model = Model(W_new, b_new)

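But as I said, my real model uses several tf.contrib.rnn.LSTMCell objects. So I need to supply the LSTMCell class with the new variables instead of letting it create its own. The problem now is how to set the variables of LSTMCell instead of letting it create them. I guess I need to subclass the LSTMCell class and make the changes. Is there an easier way to do this? Maybe I should ask this as a new question.

What I want to do: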

W = tf.get_variable(...)
b = tf.get_variable(...)
cell_fw = tf.contrib.rnn.LSTMCell(W, b, state_is_tuple=True)

I created a separate question for this here, because it might be useful to others for different reasons.

How do I replace a variable with another one in TensorFlow's computation graph?

0 answers