Fine-tuning a model in TensorFlow

Time: 2018-11-28 19:41:06

Tags: python tensorflow fine-tuning

I am training a TensorFlow model on the fer+ dataset. fer+ is split into training, validation and test partitions. After training the model, I would like to create a new classifier and then train the whole model again on another dataset. I therefore built the model as follows:

def model(inputs, return_top=True):
    # ... several conv layers applied to inputs here, producing the tensor output

    if return_top:
        output = tf.layers.dense(output, units=8, name='outputs')

    return output

with tf.variable_scope('model'):
    output_train = model(inputs_train)
    mse_train = cal_loss(output_train, labels_train) # This is a function that calculates the loss
    train_step = optimize(mse_train)    # This is a function that implements the optimizer

with tf.variable_scope('model', reuse=True):
    output_validation = model(inputs_validation)
    mse_validation = cal_loss(output_validation, labels_validation)

with tf.variable_scope('model', reuse=True):
    output_test = model(inputs_test)
    mse_test = cal_loss(output_test, labels_test)

# Now I define the rest of the model, to be used later for fine-tuning.
with tf.variable_scope('model', reuse=True):
    output_sewa_train = model(sewa_inputs_train, return_top=False)

output_sewa_train = tf.layers.dense(output_sewa_train, units=2, name='output_sewa_train')
mse_sewa_train = cal_loss(output_sewa_train, sewa_labels_train)
sewa_train_step = optimizer_2(mse_sewa_train)

with tf.variable_scope('model', reuse=True):
    output_sewa_valid = model(sewa_inputs_valid, return_top=False)

output_sewa_valid = tf.layers.dense(output_sewa_valid, units=2, name='output_sewa_valid')

Now, to train the model, I have:

fine_tune_model = False

if not fine_tune_model:
    sess.run(train_step) # etc... To train the model
else:
    sess.run(sewa_train_step) # etc... to finetune the model...
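For reference, the surrounding session code looks roughly like this (a simplified sketch; the iterator and step-count names are illustrative rather than my exact code, and the actual input pipeline is described below):

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    if not fine_tune_model:
        sess.run(train_iterator.initializer)       # iterator over the fer+ training data
        for _ in range(num_train_steps):
            sess.run(train_step)
    else:
        sess.run(sewa_train_iterator.initializer)  # iterator over the sewa training data
        for _ in range(num_finetune_steps):
            sess.run(sewa_train_step)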

Here is an image of my model: [model graph image]

Please note that I store the data in TFRecords and I am feeding it to the model with tf.data.Dataset...
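Each input pipeline is built roughly like this (a simplified sketch; the file name, feature keys and dataset parameters are illustrative, not my exact code):

import tensorflow as tf

def parse_example(serialized):
    # Illustrative feature spec: a 64x64 float image and an 8-dimensional label,
    # matching the [?,64,64] and [?,8] shapes that appear in the error below.
    features = tf.parse_single_example(serialized, {
        'image': tf.FixedLenFeature([64 * 64], tf.float32),
        'label': tf.FixedLenFeature([8], tf.float32),
    })
    return tf.reshape(features['image'], [64, 64]), features['label']

train_dataset = (tf.data.TFRecordDataset('fer_train.tfrecords')
                 .map(parse_example)
                 .shuffle(1000)
                 .batch(32))
train_iterator = train_dataset.make_initializable_iterator()
inputs_train, labels_train = train_iterator.get_next()
# The fer+ validation/test pipelines and the sewa pipelines are built the same
# way, each with its own dataset, iterator and get_next() tensors.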

Now, here is the problem:

When fine_tune_model = False, the model runs successfully, but when fine_tune_model = True it breaks with this error:

FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
 [[Node: train_dataset/train_data = IteratorGetNext[output_shapes=[[?,64,64], [?,8]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](train_dataset/Iterator)]]
 [[Node: model_2/d_block_2/bottleneck_5/dropout_1/cond/dropout/Shape-0-0-VecPermuteNCHWToNHWC-LayoutOptimizer/_4621 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_17286...tOptimizer", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]

In addition, when looking at which parts of the model are connected to each other, I got the following:

[graph screenshot]

and:

[graph screenshot]

This makes me confident that the output of the sewa-related model is independent of the fer+ dataset.

Furthermore, we can see that there are 5 models in total. The main one is named model and takes the fer+ training dataset as input; the second is model_1, which references model and takes the fer+ validation dataset as input. Third, we have model_2, which also references model and takes the fer+ test dataset. Then we have model_3, which uses the sewa_train dataset, and finally model_4, which uses the sewa validation dataset.

Moreover, if I try to run output_sewa_train and mse_sewa_train, the code runs successfully, but when I run the optimizer op I get the error.
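One way to double-check which iterators the optimizer step actually depends on is to walk the graph backwards from sewa_train_step and collect every IteratorGetNext op it reaches (a rough debugging sketch, not part of the model code above; find_iterator_deps is just a helper name):

def find_iterator_deps(op):
    # Walk the graph backwards from `op`, following both data inputs and
    # control inputs, and collect the names of all IteratorGetNext ops.
    seen, found, stack = set(), set(), [op]
    while stack:
        cur = stack.pop()
        if cur.name in seen:
            continue
        seen.add(cur.name)
        if cur.type == 'IteratorGetNext':
            found.add(cur.name)
        stack.extend(t.op for t in cur.inputs)
        stack.extend(cur.control_inputs)
    return found

# sewa_train_step is assumed to be the Operation returned by the optimizer;
# if it is a Tensor, pass sewa_train_step.op instead.
print(find_iterator_deps(sewa_train_step))

If the fer+ train_dataset/train_data op shows up in that list, that would explain why running sewa_train_step requires the fer+ training iterator to be initialized.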

Also note that I am feeding the data to the model with tf.data.Dataset rather than placeholders, which makes things more complicated.

To sum up, I am not sure how to fix the above error. Any help is much appreciated!!

0 Answers