TensorFlow Estimator accuracy and loss are zero

Time: 2018-11-08 08:21:26

Tags: python tensorflow deep-learning tensorflow-datasets tensorflow-estimator

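My model's accuracy and loss both evaluate to 0.
The global step should be 1625, but it is 1.
acc and loss should not both be 0, because the two results contradict each other: a loss of 0 would mean perfect predictions, not an accuracy of 0.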

My input function, Keras estimator, and train_and_evaluate are as follows:

import tensorflow as tf

def make_input_fn(addrs, labels, batch_size, mode):
    filename_dataset = tf.data.Dataset.from_tensor_slices((addrs, labels))

    # `process` is not shown in this post; it is called once per (path, label) pair.
    dataset = filename_dataset.apply(tf.contrib.data.map_and_batch(
        lambda addrs, labels: tuple(tf.py_func(
            process, [addrs, labels], [tf.uint8, labels.dtype])),
        batch_size,
        num_parallel_batches=2,
        drop_remainder=False))

    if mode == tf.estimator.ModeKeys.TRAIN:
        num_epochs = None  # repeat indefinitely
        dataset = dataset.apply(
            tf.contrib.data.shuffle_and_repeat(buffer_size=10000))
    else:
        num_epochs = 1
        dataset = dataset.repeat(num_epochs)

    dataset = dataset.prefetch(buffer_size=batch_size)
    images, labels = dataset.make_one_shot_iterator().get_next()
    images.set_shape([None, 512, 512, 3])
    labels.set_shape([None, 1])
    return images, labels

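# Imports are not shown in the original post; tf.keras is assumed here.
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras.layers import Dense, Dropout, GlobalAveragePooling2D
from tensorflow.keras.models import Model

def keras_estimator(model_dir, config):
    # Pretrained Xception backbone without its classification head.
    base_model = Xception(weights='imagenet', include_top=False,
                          input_shape=(512, 512, 3), classes=5)
    x = base_model.output
    x = GlobalAveragePooling2D()(x)

    x = Dense(1024, activation='relu')(x)
    x = Dropout(0.2)(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.2)(x)

    # Five-way softmax over the label classes.
    predictions = Dense(5, activation='softmax')(x)

    model = Model(inputs=base_model.input, outputs=predictions)

    # Freeze the backbone so only the new dense layers are trained.
    for layer in base_model.layers:
        layer.trainable = False
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                  metrics=['acc'])

    estimator = tf.keras.estimator.model_to_estimator(keras_model=model,
                                                      model_dir=model_dir,
                                                      config=config)
    return estimator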

import pandas as pd
from random import shuffle

def train_and_evaluate(model_dir):
    t_batch_size = 512
    e_batch_size = 64
    num_epochs = 25

    df = pd.read_csv('/content/trainLabels.csv')
    addrs = ['/content/train/train/' + str(df.iloc[i]['image']) + '.jpeg'
             for i in range(len(df))]
    labels = df['level'].values.tolist()

    # Shuffle paths and labels together, then split 90/10 into train and validation.
    c = list(zip(addrs, labels))
    shuffle(c)
    addrs1, labels1 = zip(*c)
    split = int(0.9 * len(addrs))
    train_addrs = list(addrs1[:split])
    train_labels = list(labels1[:split])
    val_addrs = list(addrs1[split:])
    val_labels = list(labels1[split:])

    run_config = tf.estimator.RunConfig(save_checkpoints_secs=300)

    estimator = keras_estimator(model_dir, run_config)

    t_max_steps = (len(train_addrs) // t_batch_size) * num_epochs

    train_spec = tf.estimator.TrainSpec(
        input_fn=lambda: make_input_fn(train_addrs, train_labels, t_batch_size,
                                       mode=tf.estimator.ModeKeys.TRAIN),
        max_steps=t_max_steps)

    eval_spec = tf.estimator.EvalSpec(
        input_fn=lambda: make_input_fn(val_addrs, val_labels, e_batch_size,
                                       mode=tf.estimator.ModeKeys.EVAL),
        steps=None,
        start_delay_secs=10,
        throttle_secs=300)

    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)

Here is the log output:

  

INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 300.
WARNING:tensorflow:From :9: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.experimental.map_and_batch(...).
WARNING:tensorflow:From :12: shuffle_and_repeat (from tensorflow.contrib.data.python.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.experimental.shuffle_and_repeat(...).
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Warm-starting with WarmStartSettings: WarmStartSettings(ckpt_to_initialize_from='/content/training/keras/keras_model.ckpt', vars_to_warm_start='.*', var_name_to_vocab_info={}, var_name_to_prev_var_name={})
INFO:tensorflow:Warm-starting from: ('/content/training/keras/keras_model.ckpt',)
INFO:tensorflow:Warm-starting variable: dense/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_1/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_1/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_2/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_2/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/iterations; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/lr; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/beta_1; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/beta_2; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/decay; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_1; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_2; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_3; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_4; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_5; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_6; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_7; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_8; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_9; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_10; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_11; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_12; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_13; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_14; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_15; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_16; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_17; prev_var_name: Unchanged
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /content/training/model.ckpt.
INFO:tensorflow:Saving checkpoints for 1 into /content/training/model.ckpt.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-11-05-13:21:17
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /content/training/model.ckpt-1
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2018-11-05-13:22:08
INFO:tensorflow:Saving dict for global step 1: acc = 0.0, global_step = 1, loss = 0.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1: /content/training/model.ckpt-1
INFO:tensorflow:Loss for final step: None.

1 Answer:

Answer 0 (score: 0)

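I had this problem before. In my case it happened because I had specified the wrong directory for my dataset, so TensorFlow ended up with no input data. I hope this helps.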
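
One way to rule this out is to verify the file paths before building the input pipeline. The following is a minimal sketch, assuming the inputs are plain local file paths like the `train_addrs` and `val_addrs` lists in the question. `find_missing_files` and `check_inputs` are hypothetical helper names used for illustration; they are not part of TensorFlow.

```python
import os


def find_missing_files(paths):
    """Return the subset of `paths` that does not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


def check_inputs(paths):
    """Raise early if the path list is empty or any listed file is missing."""
    if not paths:
        raise ValueError("No input paths were given.")
    missing = find_missing_files(paths)
    if missing:
        raise FileNotFoundError(
            "{} of {} input files are missing, e.g. {}".format(
                len(missing), len(paths), missing[0]))
```

Calling `check_inputs(train_addrs)` and `check_inputs(val_addrs)` just before `tf.estimator.train_and_evaluate(...)` would surface a wrong directory as an immediate error instead of a run that trains on nothing.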