我试图在tensorflow中创建输入管道以进行图像分类,因此我想制作批量图像和相应的标签。 Tensorflow文档建议我们可以使用tf.train.batch来进行批量输入:
train_batch, train_label_batch = tf.train.batch(
[train_image, train_image_label],
batch_size=batch_size,
num_threads=1,
capacity=10*batch_size,
enqueue_many=False,
shapes=[[224,224,3], [len(labels),]],
allow_smaller_final_batch=True
)
但是,如果我像这样输入图表,我认为会有问题:
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=train_label_batch, logits=Model(train_batch)))
问题是成本函数中的操作是否使图像及其相应的标签出列,或者它们单独返回?因此导致培训错误的图像和标签。
答案 0 :(得分:1)
为了保持图像和标签的顺序,您需要考虑几件事情。
让我们说我们需要一个能够为我们提供图像和标签的功能。
def _get_test_images(_train=False):
"""
Gets the test images and labels as a batch
Inputs:
======
_train : Boolean if images are from training set
random_crop : Boolean if random cropping is allowed
random_flip : Boolean if random horizontal flip is allowed
distortion : Boolean if distortions are allowed
Outputs:
========
images_batch : Batch of images containing BATCH_SIZE images at a time
label_batch : Batch of labels corresponding to the images in images_batch
idx : Batch of indexes of images
"""
#get images and labels
_,_img_names,_img_class,index= _get_list(_train = _train)
#total number of distinct images used for train will be equal to the images
#fed in tf.train.slice_input_producer as _img_names
img_path,label,idx = tf.train.slice_input_producer([_img_names,_img_class,index],shuffle=False)
img_path,label,idx = tf.convert_to_tensor(img_path),tf.convert_to_tensor(label),tf.convert_to_tensor(idx)
img_path = tf.cast(img_path,dtype=tf.string)
#read file
image_file = tf.read_file(img_path)
#decode jpeg/png/bmp
#tf.image.decode_image won't give shape out. So it will give error while resizing
image = tf.image.decode_jpeg(image_file)
#image preprocessing
image = tf.image.resize_images(image, [IMG_DIM,IMG_DIM])
float_image = tf.cast(image,dtype=tf.float32)
#subtracting mean and divide by standard deviation
float_image = tf.image.per_image_standardization(float_image)
#set the shape
float_image.set_shape(IMG_SIZE)
labels_original = tf.cast(label,dtype=tf.int32)
img_index = tf.cast(idx,dtype=tf.int32)
#parameters for shuffle
batch_size = BATCH_SIZE
min_fraction_of_examples_in_queue = 0.3
num_preprocess_threads = 1
num_examples_per_epoch = MAX_TEST_EXAMPLE
min_queue_examples = int(num_examples_per_epoch *
min_fraction_of_examples_in_queue)
images_batch, label_batch,idx = tf.train.batch(
[float_image,label,img_index],
batch_size=batch_size,
num_threads=num_preprocess_threads,
capacity=min_queue_examples + 3 * batch_size)
# Display the training images in the visualizer.
tf.summary.image('images', images_batch)
return images_batch, label_batch,idx
在这里,tf.train.slice_input_producer([_img_names,_img_class,index],shuffle=False)
是一个有趣的事情,看看如果你把shuffle=True
放在一起,它将协调所有三个数组。
第二件事是num_preprocess_threads
。只要您使用单线程进行出列操作,批次就会以确定的方式出现。但是不止一个线程会随机地对阵列进行洗牌。例如对于图像0001.jpg如果真实标签是1,你可能会得到2或4.一旦它出列,它就是张量形式。 tf.nn.softmax_cross_entropy_with_logits
这些张量不应该有问题。