Question

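I am trying to create an incremental classifier that is trained on data containing n classes for some number of epochs, then on n + m classes for some number of epochs, then on n + m + k, and so on, where each successive set of classes contains the previous set as a subset.

To do this without having to train the model, save it, manually edit the graph, retrain, and repeat, I simply define all the weights needed to classify the full set of classes up front, but keep the weights for unseen classes frozen at 0 until the classifier is introduced to those classes.

My strategy for this is to define a placeholder that is fed an array of boolean values specifying whether a given set of weights is trainable.

The relevant code is below: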

output_train = tf.placeholder(tf.int32, shape=(num_incremental_grps,), name="output_train")
.
.
.
weights = []
biases = []
for i in range(num_incremental_grps):
    W = tf.Variable(
        tf.zeros([batch_size, classes_per_grp]),
        trainable=tf.cond(tf.equal(output_train[i], tf.constant(1)),
                          lambda: tf.constant(True), lambda: tf.constant(False)))
    weights.append(W)
    b = tf.Variable(
        tf.zeros([classes_per_grp]),
        trainable=tf.cond(tf.equal(output_train[i], tf.constant(1)),
                          lambda: tf.constant(True), lambda: tf.constant(False)))
    biases.append(b)

out_weights = tf.stack(weights, axis=1).reshape((batch_size, -1))
out_biases = tf.stack(biases, axis=1).reshape((batch_size, -1))
outputs = tf.identity(tf.matmul(inputs, out_weights) + out_biases, name='values')
.
.
.
# Will change this to an array that progressively updates as classes are added.
output_trainable = np.ones(num_incremental_grps, dtype=bool)
.
.
.
with tf.Session() as sess:
    init.run()
    for epoch in range(epochs):
        for iteration in range(iterations):
            X_batch, y_batch = batch.getBatch()
            fd={X: X_batch, y: y_batch, training: True, output_train: output_trainable}
            _, loss_val = sess.run([training_op, loss], feed_dict=fd)

This returns the error message

Using a 'tf.Tensor' as a Python `bool` is not allowed. Use `if t is not None:` instead of 
`if t:` to test if a tensor is defined,and use TensorFlow ops such as tf.cond to execute 
subgraphs conditioned on the value of a tensor.

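I have tried tinkering with this, for example by making the initial placeholder's dtype tf.bool instead of tf.int32. I have also tried feeding a slice of the tensor directly into the "trainable" argument of the weights/biases, like this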

W = tf.Variable(tf.zeros([batch_size, classes_per_grp]), trainable=output_variable[i])

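But I get the same error message. I am not sure how to proceed from here, other than trying a completely different approach to updating the number of predictable classes. Any help would be greatly appreciated.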

Answer 1

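The error occurs because tf.cond makes its decision based on a single boolean, much like an if statement. What you want is a selection made for each element of the tensor.

You could use tf.where to get around this, but you would then hit another problem: trainable is not a property that can be changed at run time; it is part of the variable's definition. tf.Variable expects a plain Python bool there, which is why passing a tensor raises the error above. If a variable will be trained at some point (maybe not at the start, but certainly later), it must be trainable.

I suggest a simpler approach: define output_train as an array of tf.float32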

output_train = tf.placeholder(tf.float32, shape=(num_incremental_grps), name="output_train")

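Then multiply your weight and bias variables by the entries of this vector.

W = tf.Variable(...)
W = W * output_train[i]  # scale group i's weights by its 0/1 mask entry
...

Feed 1 into output_train at the positions you want to train and 0 elsewhere. A variable multiplied by 0 contributes nothing to the output, and the gradient flowing back to it is 0 as well, so it stays at its initial value until its mask entry becomes 1.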
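The effect of the multiplication on training can be checked by hand. The following is a minimal sketch in plain Python rather than TensorFlow, assuming a squared-error loss and plain gradient descent; `sgd_step` and all the values are hypothetical and only illustrate that a mask entry of 0 yields a zero gradient.

```python
def sgd_step(weights, mask, x, target, lr=0.1):
    """One gradient-descent step on loss = (sum_i w_i * m_i * x_i - target) ** 2."""
    prediction = sum(w * m * xi for w, m, xi in zip(weights, mask, x))
    error = prediction - target
    # d(loss)/d(w_i) = 2 * error * m_i * x_i, so m_i = 0 gives a zero gradient.
    grads = [2.0 * error * m * xi for m, xi in zip(mask, x)]
    return [w - lr * g for w, g in zip(weights, grads)]


weights = [0.0, 0.0, 0.0]
mask = [1.0, 0.0, 0.0]  # only the first group is switched on
updated = sgd_step(weights, mask, x=[1.0, 2.0, 3.0], target=1.0)
```

Only the first weight moves; the two masked weights keep their initial value of 0. This holds for plain gradient descent. An optimizer that applies weight decay could still change a masked variable, so that case would need separate care.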

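Take care to mask your loss as well, so that it ignores the outputs of the unwanted channels: even though they now always output 0, they can still affect your loss. For example,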

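logits = ...
# Keep only the channels whose mask entry is 1. tf.equal builds an element-wise
# boolean tensor; in TF 1.x, `output_train == 1` does not.
logits = tf.matrix_transpose(tf.boolean_mask(
  tf.matrix_transpose(logits),
  tf.equal(output_train, 1)))
# labels must be masked the same way so that its shape matches logits.
loss = tf.nn.softmax_cross_entropy_with_logits_v2(logits=logits, labels=labels)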
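To see why the masking matters, here is a small framework-free sketch in plain Python. It assumes a mask with one entry per logit channel; `softmax_cross_entropy`, `keep_active`, and the numbers are hypothetical illustrations, not TensorFlow APIs.

```python
import math


def softmax_cross_entropy(logits, labels):
    """Cross-entropy between one-hot (or soft) labels and softmax(logits)."""
    peak = max(logits)  # subtract the max for numerical stability
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return -sum(y * (z - log_norm) for z, y in zip(logits, labels))


def keep_active(values, channel_mask):
    """Drop the entries whose mask value is 0."""
    return [v for v, m in zip(values, channel_mask) if m == 1]


# Illustrative values: channels 2 and 3 belong to a group that is still
# frozen, so they output 0.
logits = [2.0, 1.0, 0.0, 0.0]
labels = [1.0, 0.0, 0.0, 0.0]
channel_mask = [1, 1, 0, 0]  # assumed to have one entry per logit channel

unmasked_loss = softmax_cross_entropy(logits, labels)
masked_loss = softmax_cross_entropy(keep_active(logits, channel_mask),
                                    keep_active(labels, channel_mask))
```

The zero logits of the frozen channels still receive probability mass under the softmax, so the unmasked loss comes out larger than the masked one even though the prediction over the active classes is identical.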

How to progressively train on more and more classes?
