Fine-tuning a model in TensorFlow

Asked: 2018-11-28 19:41:06

Tags: python tensorflow fine-tuning

I am training a TensorFlow model on the FER+ dataset. FER+ is split into training, validation and test partitions. After training the model, I want to add a new classifier head and train the whole model again on another dataset. So I build the model as follows:

def model(inputs, return_top=True):
    # ... several conv layers here; their final activation is `output` ...

    if return_top:
        output = tf.layers.dense(output, units=8, name='outputs')

    return output

with tf.variable_scope('model'):
    output_train = model(inputs_train)
    mse_train = cal_loss(output_train, labels_train) # This is a function that calculates the loss
    train_step = optimize(mse_train)    # This is a function that implements the optimizer

with tf.variable_scope('model', reuse=True):
    output_validation = model(inputs_validation)
    mse_validation = cal_loss(output_validation, labels_validation)

with tf.variable_scope('model', reuse=True):
    output_test = model(inputs_test)
    mse_test = cal_loss(output_test, labels_test)

# Now I define the rest of the model, to be used later for fine-tuning.
with tf.variable_scope('model', reuse=True):
    output_sewa_train = model(sewa_inputs_train, return_top=False)

output_sewa_train = tf.layers.dense(output_sewa_train, units=2, name='output_sewa_train')
mse_sewa_train = cal_loss(output_sewa_train, sewa_labels_train)
sewa_train_step = optimizer_2(mse_sewa_train)

with tf.variable_scope('model', reuse=True):
    output_sewa_valid = model(sewa_inputs_valid, return_top=False)

output_sewa_valid = tf.layers.dense(output_sewa_valid, units=2, name='output_sewa_valid')
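
For reference, here are rough sketches of what the helper functions referenced above do (simplified stand-ins with placeholder learning rates, not my exact implementations):

def cal_loss(predictions, labels):
    # mean squared error between predictions and ground-truth labels
    return tf.reduce_mean(tf.squared_difference(predictions, labels))

def optimize(loss):
    # optimizer used for the initial FER+ training
    return tf.train.AdamOptimizer(1e-4).minimize(loss)

def optimizer_2(loss):
    # separate optimizer used for the sewa fine-tuning stage
    return tf.train.AdamOptimizer(1e-5).minimize(loss)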

Now, to train the model, I have:

fine_tune_model = False

if not fine_tune_model:
    sess.run(train_step)        # etc. ... train the model on the FER+ data
else:
    sess.run(sewa_train_step)   # etc. ... fine-tune the model on the sewa data

Here is an image of my model (TensorBoard graph screenshot omitted).

Note that I store the data in TFRecords and feed it into the model with tf.data.Dataset...
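
For context, the input pipeline looks roughly like this (file name and feature keys are placeholders; the pipelines for the other partitions are analogous):

def parse_example(serialized):
    # placeholder feature spec matching the [?, 64, 64] images and [?, 8] labels
    features = tf.parse_single_example(
        serialized,
        features={'image': tf.FixedLenFeature([64 * 64], tf.float32),
                  'label': tf.FixedLenFeature([8], tf.float32)})
    image = tf.reshape(features['image'], [64, 64])
    return image, features['label']

train_dataset = tf.data.TFRecordDataset('fer_train.tfrecords').map(parse_example).batch(32)
train_iterator = train_dataset.make_initializable_iterator()
inputs_train, labels_train = train_iterator.get_next()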

Now here is the problem:

When fine_tune_model = False, the model runs successfully. But when fine_tune_model = True, it breaks with this error:

FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.
 [[Node: train_dataset/train_data = IteratorGetNext[output_shapes=[[?,64,64], [?,8]], output_types=[DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](train_dataset/Iterator)]]
 [[Node: model_2/d_block_2/bottleneck_5/dropout_1/cond/dropout/Shape-0-0-VecPermuteNCHWToNHWC-LayoutOptimizer/_4621 = _HostRecv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_17286...tOptimizer", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
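
As far as I understand, for an initializable iterator that simply means running the initializer op once before the first GetNext(), i.e. something like the following (iterator names here are just illustrative, taken from the sketch above):

sess.run(train_iterator.initializer)       # FER+ training iterator
sess.run(sewa_train_iterator.initializer)  # sewa training iterator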

Additionally, when inspecting which parts of the model are connected to each other, I see the following two graph views (screenshots omitted).

This makes me fairly certain that the output of the sewa-related model is independent of the FER+ dataset.

Furthermore, we can see that there are five models in total. The main one is named model and takes the FER+ training dataset as input; the second, model_1, references model and takes the FER+ validation dataset as input. Third, model_2 also references model and takes the FER+ test dataset. Then model_3 uses the sewa training dataset, and finally model_4 uses the sewa validation dataset.
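
The weight sharing between these five copies comes from reuse=True; a minimal toy example of the pattern I rely on (not my actual layers):

with tf.variable_scope('model'):
    a = tf.layers.dense(tf.zeros([1, 4]), units=3, name='fc')   # creates model/fc/kernel and model/fc/bias
with tf.variable_scope('model', reuse=True):
    b = tf.layers.dense(tf.zeros([1, 4]), units=3, name='fc')   # reuses the same variables; the new ops get a "model_1/" prefix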

Moreover, if I try to run output_sewa_train and mse_sewa_train, the code runs successfully, but as soon as I run the optimizer I get the error above.

Also, note that I have been feeding the model with tf.data.Dataset instead of placeholders, which complicates things.

In summary, I am not sure how to fix the above error. Any help is much appreciated!

0 Answers